This page demonstrates the FineDiff class (as in “fine granularity diff”), which I wrote from scratch to generate a lossless (it won't eat your line breaks), compact opcodes string listing the sequence of atomic actions (copy/delete/insert) necessary to transform one string into another (hereafter referred to as the “From” and “To” strings). The “To” string can be rebuilt by running the opcodes string on the “From” string. The FineDiff class lets you specify the granularity, down to character level, in order to generate the smallest diff possible (at the potential cost of increased CPU cycles).
$opcodes = FineDiff::getDiffOpcodes($from_text, $to_text /* , default granularity is set to character */);
// store opcodes for later use...
$to_text can be re-created from $opcodes as follows:
$to_text = FineDiff::renderToTextFromOpcodes($from_text, $opcodes);
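To make the idea of "running an opcodes string on the From string" concrete, here is a minimal sketch of a copy/delete/insert interpreter. The opcode format used here (c<n>, d<n>, i<n>:<text>) and the function name apply_toy_opcodes are assumptions made for illustration only; FineDiff's actual encoding and internals may differ.

```php
<?php
// Toy opcode interpreter, for illustration only (NOT FineDiff's code).
// Assumed format: c<n> = copy n bytes from "From",
//                 d<n> = delete (skip) n bytes of "From",
//                 i<n>:<text> = insert the n bytes that follow the colon.
function apply_toy_opcodes(string $from, string $opcodes): string
{
    $out = '';
    $from_pos = 0;                 // read cursor in the "From" string
    $op_pos = 0;                   // read cursor in the opcodes string
    $op_len = strlen($opcodes);
    while ($op_pos < $op_len) {
        $op = $opcodes[$op_pos++];
        // Read the decimal count that follows the opcode letter.
        $digits = strspn($opcodes, '0123456789', $op_pos);
        if ($digits === 0) {
            throw new InvalidArgumentException("Missing count at offset $op_pos");
        }
        $n = (int) substr($opcodes, $op_pos, $digits);
        $op_pos += $digits;
        switch ($op) {
            case 'c':              // copy n bytes of "From" to the output
                $out .= substr($from, $from_pos, $n);
                $from_pos += $n;
                break;
            case 'd':              // delete: advance past n bytes of "From"
                $from_pos += $n;
                break;
            case 'i':              // insert: emit n bytes stored in the opcodes
                $op_pos++;         // skip the ':' separator
                $out .= substr($opcodes, $op_pos, $n);
                $op_pos += $n;
                break;
            default:
                throw new InvalidArgumentException("Unknown opcode '$op'");
        }
    }
    return $out;
}

// Copy "the ", drop "quick", insert "slow", copy " fox".
echo apply_toy_opcodes('the quick fox', 'c4d5i4:slowc4'), "\n";
```

Because the inserted text is stored verbatim with an explicit length, line breaks inside it survive untouched, which is the sense in which such an opcodes string is lossless.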
The PHP-based engine of Text_Diff is forced, in order to meaningfully compare results with PHP-based FineDiff. Text_Diff is naturally geared toward line-level granularity; to compute a diff at a finer granularity (sequences, words, characters), line break characters (\n, \r) are replaced to prevent Text_Diff from eating our line breaks, so extra steps are required.
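The kind of extra step involved in protecting line breaks before a line-oriented diff can be sketched as follows. This is not the code used on this page: the helper names, the placeholder bytes, and the word-splitting rule are all assumptions, and the sketch presumes the input never contains the bytes \x01 or \x02.

```php
<?php
// Hypothetical helpers, for illustration only.
const LF_MARK = "\x01"; // assumed placeholder standing in for "\n"
const CR_MARK = "\x02"; // assumed placeholder standing in for "\r"

// Split text into word-level tokens (trailing whitespace stays attached to
// each word), then mask line breaks so every token can safely be handed to
// a line-oriented diff as one "line".
function to_line_tokens(string $text): array
{
    // Zero-width split points: after whitespace, before non-whitespace.
    $tokens = preg_split('/(?<=\s)(?=\S)/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return array_map(
        fn (string $t): string => strtr($t, ["\n" => LF_MARK, "\r" => CR_MARK]),
        $tokens
    );
}

// Reverse the masking once the diff output has been reassembled.
function from_line_tokens(array $tokens): string
{
    return strtr(implode('', $tokens), [LF_MARK => "\n", CR_MARK => "\r"]);
}

$text = "one two\r\nthree";
$tokens = to_line_tokens($text);
var_dump(from_line_tokens($tokens) === $text);
```

The round trip restores the original line breaks exactly, but both directions add passes over the text that a diff working natively at word or character level does not need.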
FineDiff is natively better equipped to generate diffs at granularities finer than line level. For example, using the above built-in sample text at word-level and character-level granularity, FineDiff executes in roughly 25 ms and 30 ms, respectively, while Text_Diff executes in roughly 75 ms and 6.5 seconds, respectively (on my development computer, a run-of-the-mill Intel Core i5 desktop).
If you wish to comment on this page, head to the associated blog entry: FineDiff, a character-level diff algorithm in PHP