linux - Merge two files with no pseudo-repetitions


I have two text files, file1.txt and file2.txt, which both contain lines of words like this:

    fare
    word
    word-ed
    wo-ded
    wor

and

    fa-re
    text
    uncial
    woded
    wor
    worded

or something like this. By a word, I mean a succession of the letters a-z (possibly with accents), together with the symbol -. My question is: how can I create a third file, output.txt, from the Linux command line (using awk, sed, etc.) out of these two files that satisfies the following conditions:

  1. If the same word occurs in both files, output.txt contains it only once.
  2. If a hyphenated version of a word in one file occurs in the other (for example, fa-re in file2.txt versus fare in file1.txt), only the hyphenated version is retained in output.txt (in our example, only fa-re is retained).

Thus, output.txt should contain the following words: fa-re word word-ed wo-ded wor text uncial

================edit========================

I have modified the files and given the output file as well. I will try to make sure manually that there are no differently hyphenated words (such as wod-ed and wo-ded).

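Another awk solution:

    # npr.awk
    # Process a line if its word is new, or if it is hyphenated
    # (a hyphenated form replaces any plain form already stored).
    !($1 in a) || $1 ~ "-" {
        key = value = $1; gsub("-", "", key); a[key] = value
    }
    END { for (i in a) print a[i] }

    $ awk -f npr.awk file1.txt file2.txt
    text
    word-ed
    uncial
    wor
    wo-ded
    word
    fa-re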
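awk does not guarantee any particular order when it walks an array with `for (i in a)`, so the words come out in arbitrary order. If the order given in the question matters (each word at the position where it was first seen), a variant along the following lines may help. This is only a sketch: the array names `best` and `order` and the counter `n` are made up for illustration, and it assumes one word per line and, as the edit to the question says, no differently hyphenated variants of the same word.

```shell
#!/bin/sh
# Recreate the sample inputs from the question (one word per line).
printf '%s\n' fare word word-ed wo-ded wor > file1.txt
printf '%s\n' fa-re text uncial woded wor worded > file2.txt

awk '
NF {
    key = $1
    gsub(/-/, "", key)          # key = the word with hyphens stripped
    if (!(key in best)) {       # first time this word is seen
        order[++n] = key        # remember its arrival position
        best[key] = $1
    } else if ($1 ~ /-/) {      # a hyphenated form replaces the plain one
        best[key] = $1
    }
}
END {
    # Print in first-seen order instead of relying on for (i in best).
    for (i = 1; i <= n; i++) print best[order[i]]
}' file1.txt file2.txt > output.txt

cat output.txt
```

On the sample files this writes fa-re, word, word-ed, wo-ded, wor, text, uncial to output.txt, one per line, which matches the order listed in the question.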
