Overview

Computer users often find occasion to ask how two files differ. Perhaps one file is a newer version of the other file. Or maybe the two files started out as identical copies but were changed by different people.

You can use the diff command to show differences between two files, or between each pair of corresponding files in two directories. diff outputs differences between files line by line in any of several formats, selectable by command-line options. This set of differences is often called a diff or patch. For files that are identical, diff normally produces no output; for binary (non-text) files, diff normally reports only that they are different.
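As a minimal sketch of the behavior described above, the following shell session compares two small text files. The file names and contents are invented for illustration; the -u option, which selects the unified output format, is just one of the available formats.

```shell
# Scratch directory and file names are hypothetical, chosen for this example.
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/old.txt"
printf 'apple\nblueberry\ncherry\n' > "$tmp/new.txt"

# Default ("normal") output format. diff exits with status 1 when the
# files differ, so the status is captured rather than treated as an error.
diff_status=0
normal_out=$(diff "$tmp/old.txt" "$tmp/new.txt") || diff_status=$?
printf '%s\n' "$normal_out"

# The same comparison in unified format (-u).
unified_out=$(diff -u "$tmp/old.txt" "$tmp/new.txt") || true
printf '%s\n' "$unified_out"

# Comparing a file with itself: no output, exit status 0.
same_status=0
same_out=$(diff "$tmp/old.txt" "$tmp/old.txt") || same_status=$?

rm -rf "$tmp"
```

In the normal format, the differing line appears as a change command (2c2) followed by the old line marked with < and the new line marked with >. The unified format shows the same change as a line prefixed with - and a line prefixed with +, surrounded by unchanged context lines.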

You can use the cmp command to show the byte and line numbers where two files differ. cmp can also show all the bytes that differ between the two files, side by side. A way to compare two files character by character is the Emacs command M-x compare-windows. See Other Window in The GNU Emacs Manual, for more information on that command.
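A short sketch of both cmp modes follows, assuming two invented files that differ in their last two characters. The -l option lists every differing byte; the exact wording and column spacing of the output may vary between cmp implementations.

```shell
# Scratch directory and file names are hypothetical, chosen for this example.
tmp=$(mktemp -d)
printf 'hello\nworld\n' > "$tmp/a.txt"
printf 'hello\nwords\n' > "$tmp/b.txt"

# Without options, cmp reports only the first difference
# (its byte number and line number) and exits with status 1.
cmp_status=0
first_diff=$(cmp "$tmp/a.txt" "$tmp/b.txt") || cmp_status=$?
printf '%s\n' "$first_diff"

# With -l, cmp prints one line per differing byte: the byte number in
# decimal, followed by the two differing byte values in octal.
all_diffs=$(cmp -l "$tmp/a.txt" "$tmp/b.txt") || true
printf '%s\n' "$all_diffs"

rm -rf "$tmp"
```

Here the files first differ at byte 10 on line 2, and cmp -l lists two differing bytes (bytes 10 and 11).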

You can use the diff3 command to show differences among three files. When two people have made independent changes to a common original, diff3 can report the differences between the original and the two changed versions, and can produce a merged file that contains both persons’ changes together with warnings about conflicts.
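The merging scenario above can be sketched as follows. The file names and contents are invented; the -m option asks diff3 to write the merged file to standard output, and the operand order is one changed version, the common original, then the other changed version.

```shell
# Scratch directory and file names are hypothetical, chosen for this example.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp/original.txt"
printf 'ONE\ntwo\nthree\nfour\nfive\n' > "$tmp/mine.txt"    # first person edits line 1
printf 'one\ntwo\nthree\nfour\nFIVE\n' > "$tmp/yours.txt"   # second person edits line 5

# The two sets of changes do not overlap, so the merge succeeds
# and diff3 exits with status 0.
merge_status=0
merged=$(diff3 -m "$tmp/mine.txt" "$tmp/original.txt" "$tmp/yours.txt") || merge_status=$?
printf '%s\n' "$merged"

# If both people change the same line differently, diff3 brackets the
# competing versions with conflict markers and exits with status 1.
printf 'uno\ntwo\nthree\nfour\nfive\n' > "$tmp/clash.txt"
conflict_status=0
conflict_out=$(diff3 -m "$tmp/mine.txt" "$tmp/original.txt" "$tmp/clash.txt") || conflict_status=$?
printf '%s\n' "$conflict_out"

rm -rf "$tmp"
```

The first merge contains both edits (ONE on the first line and FIVE on the last). The second output marks the conflicting region with lines beginning with <<<<<<< and >>>>>>>, which must be resolved by hand.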

You can use the sdiff command to merge two files interactively.

You can use the set of differences produced by diff to distribute updates to text files (such as program source code) to other people. This method is especially useful when the differences are small compared to the complete files. Given diff output, you can use the patch program to update, or patch, a copy of the file. If you think of diff as subtracting one file from another to produce their difference, you can think of patch as adding the difference to one file to reproduce the other.
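The subtract-then-add analogy can be sketched as a round trip. The file names and contents below are invented; the example assumes that the person receiving the diff has an unmodified copy of the original file.

```shell
# Scratch directory and file names are hypothetical, chosen for this example.
tmp=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$tmp/original.txt"
printf 'line one\nline 2\nline three\nline four\n' > "$tmp/updated.txt"

# "Subtract": record how updated.txt differs from original.txt.
# diff exits with status 1 when differences are found, which is expected here.
diff -u "$tmp/original.txt" "$tmp/updated.txt" > "$tmp/changes.diff" || true

# The recipient starts with a copy of the original only.
cp "$tmp/original.txt" "$tmp/copy.txt"

# "Add": apply the recorded differences to that copy.
patch "$tmp/copy.txt" < "$tmp/changes.diff"

# The patched copy should now be byte-for-byte identical to updated.txt.
roundtrip_status=0
cmp "$tmp/copy.txt" "$tmp/updated.txt" || roundtrip_status=$?
patched=$(cat "$tmp/copy.txt")

rm -rf "$tmp"
```

Only changes.diff needs to be distributed, which is why the method pays off when the differences are small relative to the full file.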

This manual first concentrates on making diffs, and later shows how to use diffs to update files.

GNU diff was written by Paul Eggert, Mike Haertel, David Hayes, Richard Stallman, and Len Tower. Wayne Davison designed and implemented the unified output format. The basic algorithm is described by Eugene W. Myers in “An O(ND) Difference Algorithm and its Variations”, Algorithmica Vol. 1, 1986, pp. 251–266, http://dx.doi.org/10.1007/BF01840446; and in “A File Comparison Program”, Webb Miller and Eugene W. Myers, Software—Practice and Experience Vol. 15, 1985, pp. 1025–1040, http://dx.doi.org/10.1002/spe.4380151102. The algorithm was independently discovered as described by Esko Ukkonen in “Algorithms for Approximate String Matching”, Information and Control Vol. 64, 1985, pp. 100–118, http://dx.doi.org/10.1016/S0019-9958(85)80046-2. Unless the --minimal option is used, diff uses a heuristic by Paul Eggert that limits the cost to O(N^1.5 log N) at the price of producing suboptimal output for large inputs with many differences. Related algorithms are surveyed by Alfred V. Aho in section 6.3 of “Algorithms for Finding Patterns in Strings”, Handbook of Theoretical Computer Science (Jan Van Leeuwen, ed.), Vol. A, Algorithms and Complexity, Elsevier/MIT Press, 1990, pp. 255–300.

GNU diff3 was written by Randy Smith. GNU sdiff was written by Thomas Lord. GNU cmp was written by Torbjörn Granlund and David MacKenzie.

GNU patch was written mainly by Larry Wall and Paul Eggert; several GNU enhancements were contributed by Wayne Davison and David MacKenzie. Parts of this manual are adapted from a manual page written by Larry Wall, with his permission.

