in Programming Languages by (7k points)

The index() method finds the index of an element in a list and works fine when the list is small. But if the list contains millions of elements and you need the indices of thousands of them, calling index() repeatedly is too slow. The NumPy functions in1d() and isin() are faster than index(), but they do not preserve the order of the elements being searched for. So, what is the quickest way to find the indices of elements in a very large list?

1 Answer

by (13.5k points)

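You can build a dictionary that maps each value in the big list to its index, then look up the desired elements in that dictionary. A dictionary lookup takes constant time on average, whereas every index() call scans the list from the start. Note that the values must be hashable, and if the big list contains duplicates, the loop in the example keeps the index of the last occurrence, whereas index() returns the first. Check the following example: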

>>> sl
[8, 6, 4, 5, 0, 2, 7, 1, 3, 9]
>>> bg
[33, 15, 28, 41, 27, 30, 46, 3, 25, 17, 2, 36, 29, 20, 22, 8, 47, 39, 7, 19, 11, 5, 12, 42, 24, 16, 9, 43, 13, 45, 21, 37, 18, 0, 31, 44, 23, 1, 32, 40, 14, 34, 48, 6, 4, 35, 38, 26, 49, 10]
>>> tmp_dict = {}
>>> for i,v in enumerate(bg):
...     tmp_dict[v] = i
...
>>> idx=[]
>>> for e in sl:
...     idx.append(tmp_dict[e])
...
>>> idx
[15, 43, 44, 21, 33, 10, 18, 37, 7, 26]
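
If you need this in a script rather than at the prompt, the same idea can be wrapped in a function. The following is only a sketch: the names build_index and find_indices and the missing=-1 default are my own choices for illustration, not anything from a library. It assumes the values are hashable, keeps the first occurrence of a duplicate (to match what index() would return), and returns a placeholder for values that are not in the big list instead of raising KeyError.

```python
def build_index(values):
    """Map each value to the index of its first occurrence in `values`."""
    first_index = {}
    for i, v in enumerate(values):
        # setdefault only stores the index the first time a value is seen,
        # so duplicates resolve to the earliest position, like list.index().
        first_index.setdefault(v, i)
    return first_index


def find_indices(big, wanted, missing=-1):
    """Return the position in `big` of each item in `wanted`, in the order of `wanted`.

    Items that do not occur in `big` are reported as `missing`
    (a hypothetical sentinel chosen for this sketch).
    """
    lookup = build_index(big)  # one pass over the big list
    return [lookup.get(item, missing) for item in wanted]  # one lookup per item


big = [33, 15, 28, 8, 15, 3]
wanted = [8, 15, 99, 3]
print(find_indices(big, wanted))  # [3, 1, -1, 5]
```

The dictionary is built once in a single pass over the big list, so the total work is roughly proportional to len(big) + len(wanted), rather than len(big) * len(wanted) as with repeated index() calls. If you run many queries against the same big list, call build_index once and reuse the result.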

...