Python speed comparison: regex vs startswith vs str[:word_length]

Introduction

There are several ways to check whether a string starts with a given prefix in Python. This article compares the speed of the following three typical approaches.

Regular expressions
startswith
str[:word_length] == word
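
As a quick illustration of the three approaches listed above, here is a minimal sketch showing that they give the same answer for a prefix check. The prefix `foo`, the sample strings, and the helper names `by_regex`, `by_startswith`, and `by_slice` are assumptions made for this example and are not part of the measured program.

```python
import re

prefix = "foo"  # hypothetical sample prefix
# re.escape makes sure the prefix is matched literally, even if it contains regex metacharacters
pattern = re.compile(re.escape(prefix))


def by_regex(word):
    # Pattern.match only matches at the beginning of the string
    return pattern.match(word) is not None


def by_startswith(word):
    return word.startswith(prefix)


def by_slice(word):
    # Slicing past the end of a short string is safe; it simply returns a shorter string
    return word[:len(prefix)] == prefix


samples = ["foo_1", "bar_2", "fo", ""]
for word in samples:
    print(repr(word), by_regex(word), by_startswith(word), by_slice(word))
```

All three functions agree on every sample, including strings shorter than the prefix, so the choice between them comes down to speed and readability.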

Measuring method

Environment

The measurements were run in the following execution environment.

item	value
Python Version	3.8.2
OS	Ubuntu 20.04

Program

The measurements are based on the following program. The role of each variable and function is listed below. Change the variables according to the characteristics you want to measure.

variable/function	Description
time_logging	Decorator that measures execution time
compare_regex	Checks each string in the argument list against a regular expression
compare_startswith	Checks each string in the argument list with the `startswith` method
compare_str	Checks whether the leading slice of each string in the argument list equals `target_word`
target_word	Prefix string to compare against
match_word	Prefix of the generated strings that match `target_word`
not_match_word	Prefix of the generated strings that do not match `target_word`
compare_word_num	Total number of strings to compare
match_rate	Percentage of the generated strings that should match
compare_func	Function to measure
main	Entry point that builds the strings and calls `compare_func`

import re
import time


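def time_logging(func):
    """Print how long the wrapped function takes to run."""
    def deco(*args, **kwargs):
        # perf_counter is a monotonic, high-resolution clock intended for timing
        stime = time.perf_counter()
        res = func(*args, **kwargs)
        etime = time.perf_counter()
        print(f'Finish {func.__name__}. Takes {round(etime - stime, 3)}s.', flush=True)
        return res

    return deco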


@time_logging
def compare_regex(compare_words):
    # Escape target_word so that regex metacharacters in it are matched literally
    pattern = re.compile(f'^{re.escape(target_word)}')
    for word in compare_words:
        if pattern.match(word):
            pass


@time_logging
def compare_startswith(compare_words):
    for word in compare_words:
        if word.startswith(target_word):
            pass


@time_logging
def compare_str(compare_words):
    length = len(target_word)
    for word in compare_words:
        if word[:length] == target_word:
            pass


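target_word = 'foo'  # prefix to search for
match_word = f'{target_word}'  # prefix of strings that match
not_match_word = 'bar'  # prefix of strings that do not match
compare_word_num = 100_000_000  # total number of strings to generate
match_rate = 50  # percentage of strings that match
compare_func = compare_regex  # function to measure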


def main():
    compare_words = []
    for index in range(compare_word_num):
        # Use < rather than <= so that exactly match_rate percent of the strings match
        if index % 100 < match_rate:
            compare_words.append(f'{match_word}_{index}')
        else:
            compare_words.append(f'{not_match_word}_{index}')

    compare_func(compare_words)


if __name__ == '__main__':
    main()

Parameters

Because the relative execution speed may change depending on the length of the string being compared, the execution speed of compare_regex, compare_startswith, and compare_str is measured with target_word set to 5, 10, 50, 100, and 500 characters.
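
The sweep over prefix lengths described above could be scripted roughly as follows. This is only a sketch under several assumptions: it uses `timeit` instead of the article's `time_logging` decorator, a much smaller number of strings, a prefix made of repeated `a` characters, and a non-matching prefix of the same length made of `b` characters. The helpers `build_words` and `count_startswith` are hypothetical names introduced here, and only the `startswith` variant is timed.

```python
import timeit


def build_words(target_word, count, match_rate=50):
    """Build `count` strings; `match_rate` percent of them start with `target_word`."""
    # Assumption: the non-matching prefix has the same length but a different first character
    not_match_word = "b" * len(target_word)
    words = []
    for index in range(count):
        head = target_word if index % 100 < match_rate else not_match_word
        words.append(f"{head}_{index}")
    return words


def count_startswith(words, target_word):
    # Count the strings that start with target_word
    return sum(1 for word in words if word.startswith(target_word))


for length in (5, 10, 50, 100, 500):
    target_word = "a" * length
    words = build_words(target_word, count=10_000)
    seconds = timeit.timeit(lambda: count_startswith(words, target_word), number=10)
    print(f"length={length}: {seconds:.4f}s")
```

Timings from this sketch are not comparable with the table below, since the workload is far smaller; it only shows how the length parameter can be varied in one run.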

Measurement

Unit: seconds

function \ target_word length (characters)	5	10	50	100	500
compare_regex	11.617	12.044	16.126	18.837	66.463
compare_startswith	6.647	6.401	6.241	6.297	6.931
compare_str	5.941	5.993	4.87	5.449	8.875
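
To make the gap easier to read, the measured values above can be turned into ratios against compare_startswith. The following is a small sketch that only reuses the numbers from the table; the names `measured` and `ratios_to` are introduced here for illustration.

```python
lengths = [5, 10, 50, 100, 500]

# Measured seconds copied from the table above
measured = {
    "compare_regex": [11.617, 12.044, 16.126, 18.837, 66.463],
    "compare_startswith": [6.647, 6.401, 6.241, 6.297, 6.931],
    "compare_str": [5.941, 5.993, 4.87, 5.449, 8.875],
}


def ratios_to(baseline, name):
    """Return {length: time(name) / time(baseline)} for every measured length."""
    return {
        length: measured[name][i] / measured[baseline][i]
        for i, length in enumerate(lengths)
    }


for name in ("compare_regex", "compare_str"):
    ratios = ratios_to("compare_startswith", name)
    formatted = ", ".join(f"{length}: {ratio:.2f}x" for length, ratio in ratios.items())
    print(f"{name} vs compare_startswith -> {formatted}")
```

With these numbers, the regular expression is roughly 1.7 times slower than startswith at 5 characters and between 9 and 10 times slower at 500 characters, while the slice comparison stays within about 0.8 to 1.3 times of startswith.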

Consideration

In terms of speed, a prefix match should be implemented with startswith or str[:word_length] regardless of the number of characters. I recommend startswith the most, because its execution time is the least affected by the length of the string being compared. It is also the one I like best in terms of readability.
