Substitute last 4 digits in second and third column

Question

I have a file like the following:

2300
10     1112221234     111222123420231121PPPPD10+0000000850      ESIM
10     3334446789     333444678920231121PPPPD11+0000000950      RSIM
23

I want the output to be as follows:

2300
10     1112222345     111222234520231121PPPPD10+0000000850      ESIM
10     3334447890     333444789020231121PPPPD11+0000000950      RSIM
23

I tried the code below and was able to replace the last 4 digits in the second column and the last 4 digits before the date in the third column.
But it also collapsed the extra spaces and dropped everything from the 11th character onwards in the third column, giving the following:

2300
10 1112222345 1112222345 ESIM
10 3334447890 3334447890 RSIM
23

awk '
BEGIN { FS=OFS=" " }
{if(length($2)>9 && length($3)>9)
   {$2 = substr($2,-10)
   $3 = substr($3,1,10) 
    for (i=2;i<=3;i++) {                                   
        str = substr($i, 1, length($i) - 4)                 
        for (j = length($i) - 3; j <= length($i); j++) {    
            str = str (substr($i, j, 1) + 1) % 10           
        }
        $i = str                                            
    }
}}
1' filename

score 10 · Accepted Answer · 2023-11-23 06:53:18Z

With GNU awk, please try the following code. Written and tested with the shown samples.

awk -v OFS="\t" '
match($2,/(.*)([0-9])([0-9])([0-9])([0-9])$/,arr){
  val1=val2=""                          # clear values left over from an earlier match
  if(arr[3]==9)     { val1=(arr[2] arr[3]) + 1                                }
  if(arr[5]==9)     { val2=(arr[4] arr[5]) + 1                                }
  if(val1 && !val2) { $2= arr[1] val1 arr[4]+1 arr[5]+1                       }
  if(val2 && !val1) { $2 = arr[1]  arr[2]+1 arr[3]+1  val2                    }
  if(val1 && val2)  {  $2 = arr[1] val1 val2                                  }
  if(!val1 && !val2){ $2 = arr[1] arr[2]+1 arr[3]+1 arr[4]+1 arr[5]+1         }
}
match($3,/(^.{6})([0-9])([0-9])([0-9])([0-9])(.*$)/,arr){
   val1=val2=""                         # same reset before handling the third column
   if(arr[3]==9)     { val1=(arr[2] arr[3]) + 1                               }
   if(arr[5]==9)     { val2=(arr[4] arr[5]) + 1                               }
   if(val1 && !val2) { $3= arr[1] val1 arr[4]+1 arr[5]+1 arr[6]               }
   if(val2 && !val1) { $3 = arr[1]  arr[2]+1 arr[3]+1  val2 arr[6]            }
   if(val1 && val2)  { $3 = arr[1] val1 val2 arr[6]                           }
   if(!val1 && !val2){ $3 = arr[1] arr[2]+1 arr[3]+1 arr[4]+1 arr[5]+1 arr[6] }
}
1
' Input_file | column -t

jared_mamrot · Accepted Answer · 2023-11-23 05:59:44Z

If you capture each ‘part of interest’ from columns $2 and $3, then increment the 4 digits, then use printf to print the lines, you can get your desired outcome, e.g.

awk 'BEGIN {
        FS = OFS = " "
}

{
        if (length($2) > 9 && length($3) > 9) {
                col2_first_part = substr($2, 1, 6)      # awk strings are 1-indexed
                col2_4_digits = substr($2, 7, 4)
                col3_first_part = substr($3, 1, 6)
                col3_4_digits = substr($3, 7, 4)
                col3_last_part = substr($3, 11, length($3) - 10)
                printf "%s\t%s", $1, col2_first_part
                for (i = 1; i <= 4; i++) {
                        printf "%s", (substr(col2_4_digits, i, 1) + 1) % 10
                }
                printf "\t%s", col3_first_part
                for (j = 1; j <= 4; j++) {
                        printf "%s", (substr(col3_4_digits, j, 1) + 1) % 10
                }
                printf "%s\t%s\n", col3_last_part, $4
        } else {
                print
        }
}' filename
2300
10  1112222345  111222234520231121PPPPD10+0000000850    ESIM
10  3334447890  333444789020231121PPPPD11+0000000950    RSIM
23

Does that help?

Thanks. It removes all the columns after the 4th one. The change is needed in the second and third columns only, but I sometimes receive 5 to 10 columns. How can I keep the values in columns 4 to 10 as they are? · 24 mins ago
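One way to keep any trailing columns, offered as a minimal sketch rather than the answerer's own fix: assign the new values back to $2 and $3 and let awk print the whole record, so every remaining field is carried along. The shell function name bump_fields and the awk helper bump are hypothetical names introduced here. The sketch assumes the digits to change are the last 4 characters of column 2 and characters 7 to 10 of column 3, as in the sample data. Modified lines come out tab-separated; the original spacing is not preserved.

```shell
# Hypothetical wrapper: reads the named files, or stdin if none are given.
bump_fields() {
  awk '
  # bump(s, start): return s with the 4 characters beginning at position
  # start each replaced by (digit + 1) % 10; the rest of s is untouched.
  function bump(s, start,    i, out) {
      out = substr(s, 1, start - 1)
      for (i = start; i < start + 4; i++)
          out = out ((substr(s, i, 1) + 1) % 10)
      return out substr(s, start + 4)
  }
  BEGIN { OFS = "\t" }
  length($2) > 9 && length($3) > 9 {
      $2 = bump($2, length($2) - 3)   # last 4 digits of column 2
      $3 = bump($3, 7)                # digits 7-10 of column 3
  }
  1                                   # print every line, changed or not
  ' "$@"
}
```

Call it as `bump_fields filename`. Header and trailer lines such as 2300 and 23 have no second column, so they fail the length test and are printed unchanged.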

score 4 · Accepted Answer · 2023-11-23 17:49:35Z

Assumptions:

the string of interest (old) is the entire 2nd column
old is also the prefix of the 3rd column
old only shows up twice in a line (as 2nd column, as prefix of 3rd column)
lines of interest have 4 space-delimited columns
need to maintain spacing as it exists in the input

One awk idea:

awk '
NF==4 { old  = $2
        len  = length(old)
        new  = substr(old,1,len-4)
        for (i=len-3; i<=len; i++)
            new = new ((substr(old,i,1)+1) % 10)
        gsub(old,new)                              # replace both instances of "old" with "new"
      }
1
' filename

This generates:

2300
10     1112222345     111222234520231121PPPPD10+0000000850      ESIM
10     3334447890     333444789020231121PPPPD11+0000000950      RSIM
23

Thanks. This works, but if the phone number in the third column (its first 10 digits) is not the same as the second column, it does not replace the third column's digits 7 to 10. · 27 mins ago
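A possible way around that, sketched under the assumption that the digits to change are always the last 4 characters of column 2 and characters 7 to 10 of column 3: walk through the first three whitespace-separated tokens of the line, rewrite tokens 2 and 3 independently, and glue the pieces back together with the original spacing. The names bump_in_place and bump are hypothetical, and this is not the code from the answer above.

```shell
# Hypothetical wrapper: reads the named files, or stdin if none are given.
bump_in_place() {
  awk '
  # bump(s, start): return s with the 4 characters beginning at position
  # start each replaced by (digit + 1) % 10; the rest of s is untouched.
  function bump(s, start,    i, out) {
      out = substr(s, 1, start - 1)
      for (i = start; i < start + 4; i++)
          out = out ((substr(s, i, 1) + 1) % 10)
      return out substr(s, start + 4)
  }
  NF >= 3 && length($2) > 9 && length($3) > 9 {
      rest = $0; done = ""
      for (f = 1; f <= 3; f++) {
          match(rest, /[^ \t]+/)                  # next token in the remaining text
          tok = substr(rest, RSTART, RLENGTH)
          if (f == 2) tok = bump(tok, length(tok) - 3)
          if (f == 3) tok = bump(tok, 7)
          done = done substr(rest, 1, RSTART - 1) tok   # keep the spacing before the token
          rest = substr(rest, RSTART + RLENGTH)
      }
      $0 = done rest                              # everything after column 3 is copied verbatim
  }
  1
  ' "$@"
}
```

Because the columns are never reassigned as fields, awk does not rebuild the record with OFS, so the spacing and any number of trailing columns survive as they were in the input.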
