Remove Duplicate Lines From Text Files

awk '{if (!( $0 in lines)) print $0; lines[$0]=1;}' in.file

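This is a quick way to clean up a text file by removing exact duplicate lines without changing the order of the remaining lines. The first occurrence of each line is kept and later repeats are dropped. The following cannot do that, because sort reorders the lines before uniq removes the duplicates: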

sort in.file | uniq

Using this test data in in.file:

This
is
a
another
another
test
another
This
a
is
test

This awk script removes the duplicates and keeps the remaining lines in their original order:

awk '{if (!( $0 in lines)) print $0; lines[$0]=1;}' in.file
This
is
a
another
test

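Whereas this also removes the duplicates, but returns the lines in sorted order (the exact order depends on the locale):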

sort in.file | uniq
a
another
is
test
This
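
To show how the awk approach works, here is a minimal sketch that spells out the one-liner with comments and adds a shorter idiom that is commonly used for the same job. The function names dedupe_verbose and dedupe are made up for this illustration and are not part of awk or of the original tip.

```shell
# Spelled-out version of the one-liner: print a line only the first time it is seen.
# (dedupe_verbose is a hypothetical helper name.)
dedupe_verbose() {
    awk '{
        if (!($0 in lines))   # line not recorded yet: first occurrence
            print $0
        lines[$0] = 1         # remember the line for later comparisons
    }' "$@"
}

# Shorter idiom: seen[$0]++ evaluates to 0 (false) the first time a line
# appears, so the negated pattern is true only then, and the default
# action of awk prints the line. (dedupe is a hypothetical helper name.)
dedupe() {
    awk '!seen[$0]++' "$@"
}

# Same test data as in.file, fed through standard input.
printf '%s\n' This is a another another test another This a is test | dedupe
```

Both functions accept file names as arguments (for example: dedupe in.file) or read standard input. Either way awk keeps every distinct line in memory, so memory use grows with the number of unique lines in the input.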