djdevin (edited)

Yes, but it depends on the usage. If you are comparing files to find a match (let's say you're trying to find a file in a database of millions of files), you hash the file, then search for files whose hashes start with the same first N characters. It's "fuzzy", but it performs better and is still accurate within reasonable certainty.
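A rough sketch of that lookup, in case it helps. Everything here is made up for illustration: the `PrefixIndex` class, the choice of SHA-256, and N = 10 are my assumptions, not anything a particular database does. The idea is just to bucket files by the first N hex characters of the hash, then confirm against the full hash only when you need certainty.

```python
import hashlib
from collections import defaultdict

PREFIX_LEN = 10  # assumed value of N; tune it to your own risk tolerance


def file_digest(data: bytes) -> str:
    """Return the full hex digest of the file contents (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


class PrefixIndex:
    """Hypothetical in-memory index keyed by the first N hex characters."""

    def __init__(self, prefix_len: int = PREFIX_LEN):
        self.prefix_len = prefix_len
        self._buckets = defaultdict(list)  # prefix -> [(name, full digest)]

    def add(self, name: str, data: bytes) -> None:
        digest = file_digest(data)
        self._buckets[digest[: self.prefix_len]].append((name, digest))

    def candidates(self, data: bytes) -> list:
        """Fast, "fuzzy" match: everything sharing the hash prefix."""
        digest = file_digest(data)
        bucket = self._buckets.get(digest[: self.prefix_len], [])
        return [name for name, _ in bucket]

    def exact(self, data: bytes) -> list:
        """Slower, stricter match: the full digest has to agree as well."""
        digest = file_digest(data)
        bucket = self._buckets.get(digest[: self.prefix_len], [])
        return [name for name, full in bucket if full == digest]
```

`candidates()` is the cheap prefix check and `exact()` is the full comparison, so you can pick whichever fits the risk you're willing to take.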

If you're checking something like a password, or something more sensitive like a key, then yes, you need to match the entire string, because a partial match is much easier to brute force: with a hex digest, every character you skip makes a matching guess 16 times more likely.
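To get a feel for how cheap a partial match is, here is a small sketch. The function name `find_prefix_match`, the `guess-N` inputs, and SHA-256 are all my own assumptions for the demo. It keeps guessing until some input's hash shares the first few hex characters with the target, which takes about 16^k tries on average for a k-character prefix.

```python
import hashlib
from itertools import count


def find_prefix_match(target_hex: str, prefix_len: int):
    """Brute-force an input whose SHA-256 hex digest shares the target's
    first prefix_len characters. Returns (guess, number_of_attempts).

    Only practical for short prefixes: the expected number of attempts
    is about 16 ** prefix_len.
    """
    prefix = target_hex[:prefix_len]
    for attempts in count(1):
        guess = f"guess-{attempts}"  # hypothetical guess format
        if hashlib.sha256(guess.encode()).hexdigest().startswith(prefix):
            return guess, attempts


if __name__ == "__main__":
    secret_hash = hashlib.sha256(b"correct horse battery staple").hexdigest()
    guess, attempts = find_prefix_match(secret_hash, 4)
    print(f"{guess!r} matches the first 4 characters after {attempts} tries")
```

Four characters falls in a fraction of a second on ordinary hardware, while matching all 64 characters of a SHA-256 digest this way is not feasible. That gap is why secrets need the whole string checked.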

So to answer the original question: it's really about weighing the level of risk you're willing to accept against the performance/convenience benefits.

For total accuracy you have to check the whole string, and even that only gives you a very high probability that you have the correct file. As someone mentioned already, it is now possible to have two different files generate the same md5 sum (that's called a "collision", and it's why md5 is considered broken). Collisions can be crafted on purpose, but hitting one by accident is still extremely improbable.

Source: Git (sha1 but still relevant) - https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection#Short-SHA-1

Generally, eight to ten characters are more than enough to be unique within a project. As an example, the Linux kernel, which is a pretty large project with over 450k commits and 3.6 million objects, has no two objects whose SHA-1s overlap more than the first 11 characters.
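If you want to sanity-check numbers like these for your own project, the usual birthday-bound estimate is easy to compute. This is only a sketch and it assumes hash prefixes behave like uniformly random values; the function names are my own.

```python
import math


def expected_colliding_pairs(n_objects: int, hex_chars: int) -> float:
    """Expected number of object pairs sharing the same hex prefix,
    assuming prefixes are uniformly random (birthday approximation)."""
    space = 16 ** hex_chars  # number of possible prefixes
    return n_objects * (n_objects - 1) / (2 * space)


def collision_probability(n_objects: int, hex_chars: int) -> float:
    """Approximate chance that at least one pair shares a prefix,
    using the Poisson approximation 1 - exp(-expected pairs)."""
    return -math.expm1(-expected_colliding_pairs(n_objects, hex_chars))


if __name__ == "__main__":
    for chars in (8, 10, 11, 12):
        pairs = expected_colliding_pairs(3_600_000, chars)
        prob = collision_probability(3_600_000, chars)
        print(f"{chars} chars: ~{pairs:.3g} expected pairs, p = {prob:.3g}")
```

With 3.6 million objects and 11 hex characters, this comes out to roughly 0.37 expected colliding pairs, so under that assumption an 11-character prefix sits right around the point where a project of that size starts to see overlaps.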