linux - Merge two files with no pseudo-repetitions
I have 2 text files, file1.txt and file2.txt. Both contain lines of words, like this (file1.txt):

fare word word-ed wo-ded wor

or this (file2.txt):

fa-re text uncial woded wor worded

By word, I mean a succession of the letters a-z, possibly with accents, and the symbol -.

My question is: how can I create a third file, output.txt, from the Linux command line (using awk, sed, etc.) out of these 2 files that satisfies the following 3 conditions:

- If the same word occurs in the 2 files, the third file output.txt contains it only once.
- If a hyphenated version (for example fa-re in file2.txt) of a word in one file occurs in the other, only the hyphenated version is retained in output.txt (for example, fa-re is retained in our example).

Thus, output.txt should contain the following words:

fa-re word word-ed wo-ded wor text uncial
================edit========================
I have modified the files and given the output file as well. I will try to make sure manually that there are no differently hyphenated words (such as wod-ed and wo-ded).
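Another awk:

    # Process a line if its word is new, or if it is a hyphenated spelling.
    !($1 in a) || $1 ~ "-" {
        key = value = $1
        gsub("-", "", key)    # the key is the word with hyphens removed
        a[key] = value        # a hyphenated spelling overwrites a plain one
    }
    END { for (i in a) print a[i] }

    $ awk -f npr.awk file1.txt file2.txt
    text
    word-ed
    uncial
    wor
    wo-ded
    word
    fa-re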
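Note that awk visits the keys in `for (i in a)` in an unspecified order, which is why the words above come out shuffled. If the order of first appearance matters, something along these lines might work. It is only a sketch: it assumes one word per line (the sample files are recreated that way here), and the array names `best` and `order` are made up for illustration.

```shell
# Recreate the sample inputs from the question (assumption: one word per line).
printf '%s\n' fare word word-ed wo-ded wor > file1.txt
printf '%s\n' fa-re text uncial woded wor worded > file2.txt

# Same de-hyphenated-key idea, but remember the order in which keys first appear.
awk '
NF {
    key = $1
    gsub(/-/, "", key)           # "fa-re" and "fare" share the key "fare"
    if (!(key in best)) {        # first time this word is seen
        order[++n] = key
        best[key] = $1
    } else if ($1 ~ /-/) {       # a hyphenated spelling replaces a plain one
        best[key] = $1
    }
}
END {
    for (i = 1; i <= n; i++) print best[order[i]]
}
' file1.txt file2.txt > output.txt

cat output.txt
```

With the sample files this prints fa-re, word, word-ed, wo-ded, wor, text and uncial, one per line, in the same order as the output requested in the question. As in the one-array version, if two differently hyphenated spellings of the same word occur, the later one wins.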