I have a file like the following:
2300
10 1112221234 111222123420231121PPPPD10+0000000850 ESIM
10 3334446789 333444678920231121PPPPD11+0000000950 RSIM
23
I want the output to be as follows:
2300
10 1112222345 111222234520231121PPPPD10+0000000850 ESIM
10 3334447890 333444789020231121PPPPD11+0000000950 RSIM
23
I tried the following code and was able to increment the last 4 digits of the second column and the 4 digits before the date in the third column.
But it also collapsed the extra spaces and dropped everything from the 11th character onwards in the third column, so I got the following:
2300
10 1112222345 1112222345 ESIM
10 3334447890 3334447890 RSIM
23
awk '
BEGIN { FS=OFS=" " }
{if(length($2)>9 && length($3)>9)
{$2 = substr($2,-10)
$3 = substr($3,1,10)
for (i=2;i<=3;i++) {
str = substr($i, 1, length($i) - 4)
for (j = length($i) - 3; j <= length($i); j++) {
str = str (substr($i, j, 1) + 1) % 10
}
$i = str
}
}}
1' filename
3 Answers
With GNU awk, please try the following code. Written and tested with the shown samples.
awk -v OFS="\t" '
match($2,/(.*)([0-9])([0-9])([0-9])([0-9])$/,arr){
  val1=val2=""   # reset so values from an earlier match do not leak in
  if(arr[3]==9) { val1=(arr[2] arr[3]) + 1 }
  if(arr[5]==9) { val2=(arr[4] arr[5]) + 1 }
  if(val1 && !val2) { $2 = arr[1] val1 arr[4]+1 arr[5]+1 }
  if(val2 && !val1) { $2 = arr[1] arr[2]+1 arr[3]+1 val2 }
  if(val1 && val2) { $2 = arr[1] val1 val2 }
  if(!val1 && !val2){ $2 = arr[1] arr[2]+1 arr[3]+1 arr[4]+1 arr[5]+1 }
}
match($3,/(^.{6})([0-9])([0-9])([0-9])([0-9])(.*$)/,arr){
  val1=val2=""   # reset again before handling the third column
  if(arr[3]==9) { val1=(arr[2] arr[3]) + 1 }
  if(arr[5]==9) { val2=(arr[4] arr[5]) + 1 }
  if(val1 && !val2) { $3 = arr[1] val1 arr[4]+1 arr[5]+1 arr[6] }
  if(val2 && !val1) { $3 = arr[1] arr[2]+1 arr[3]+1 val2 arr[6] }
  if(val1 && val2) { $3 = arr[1] val1 val2 arr[6] }
  if(!val1 && !val2){ $3 = arr[1] arr[2]+1 arr[3]+1 arr[4]+1 arr[5]+1 arr[6] }
}
1
' Input_file | column -t
If you capture each ‘part of interest’ from columns $2 and $3, increment the 4 digits, and then use printf to print the lines, you can get your desired outcome, e.g.
awk 'BEGIN {
    FS = OFS = " "
}
{
    if (length($2) > 9 && length($3) > 9) {
        # awk strings are 1-indexed, so start each substr() at 1
        col2_first_part = substr($2, 1, 6)
        col2_4_digits = substr($2, 7, 4)
        col3_first_part = substr($3, 1, 6)
        col3_4_digits = substr($3, 7, 4)
        col3_last_part = substr($3, 11, length($3) - 10)
        printf "%s\t%s", $1, col2_first_part
        for (i = 1; i <= 4; i++) {
            printf "%s", (substr(col2_4_digits, i, 1) + 1) % 10
        }
        printf "\t%s", col3_first_part
        for (j = 1; j <= 4; j++) {
            printf "%s", (substr(col3_4_digits, j, 1) + 1) % 10
        }
        printf "%s\t%s\n", col3_last_part, $4
    } else {
        print
    }
}' filename
2300
10 1112222345 111222234520231121PPPPD10+0000000850 ESIM
10 3334447890 333444789020231121PPPPD11+0000000950 RSIM
23
Does that help?
Thanks. It removes all the columns after the 4th one. The change is needed in the second and third columns only, but I sometimes receive 5 to 10 columns. How can I keep the values in columns 4 to 10 as is?
– Sam, 24 mins ago
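To keep every column after the 4th, one option is to assign the new values back to $2 and $3 and let awk print the whole record, rather than printing selected fields with printf. The following is only a minimal sketch of that idea: the wrapper name `bump_fields` and the helper `bump` are made up for illustration, and it assumes (like the answer above) that the 4 digits of interest are characters 7 to 10 of both columns. The output is tab-separated, so the original spacing is not preserved.

```shell
# Hypothetical wrapper: reads the named file(s), or stdin if none are given.
bump_fields() {
    awk '
    # Add 1 (mod 10) to each of the n digits of s starting at position pos.
    function bump(s, pos, n,    i, out) {
        out = substr(s, 1, pos - 1)
        for (i = pos; i < pos + n; i++)
            out = out ((substr(s, i, 1) + 1) % 10)
        return out substr(s, pos + n)
    }
    BEGIN { OFS = "\t" }
    length($2) > 9 && length($3) > 9 {
        # Assigning to a field rebuilds the record with OFS,
        # and every other field (4, 5, ... NF) is kept as is.
        $2 = bump($2, 7, 4)
        $3 = bump($3, 7, 4)
    }
    { print }
    ' "$@"
}
```

It would be called as `bump_fields filename`. Lines that do not meet the length test, such as the header and trailer, are printed unchanged.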
Assumptions:
- the string of interest (old) is the entire 2nd column
- old is also the prefix of the 3rd column
- old only shows up twice in a line (as 2nd column, as prefix of 3rd column)
- lines of interest have 4 space-delimited columns
- need to maintain spacing as it exists in the input
One awk idea:
awk '
NF==4 { old = $2
len = length(old)
new = substr(old,1,len-4)
for (i=len-3; i<=len; i++)
new = new ((substr(old,i,1)+1) % 10)
gsub(old,new) # replace both instances of "old" with "new"
}
1
' filename
This generates:
2300
10 1112222345 111222234520231121PPPPD10+0000000850 ESIM
10 3334447890 333444789020231121PPPPD11+0000000950 RSIM
23
Thanks. This works, but if the phone number in the third column (its first 10 digits) is not the same as the second column, it does not replace digits 7 to 10 of the third column.
– Sam, 27 mins ago
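If the two columns can hold different numbers, one way around the gsub() limitation is to locate each column inside $0 and rewrite the digits at those positions, without ever assigning to a field. The following is a rough sketch under a few assumptions: columns are separated by spaces or tabs, the last 4 characters of column 2 and characters 7 to 10 of column 3 are digits, and the names `bump_in_place` and `bump` are invented for this example. Because $0 is never rebuilt, the original spacing and any extra columns are left alone.

```shell
# Hypothetical wrapper: reads the named file(s), or stdin if none are given.
bump_in_place() {
    awk '
    # Add 1 (mod 10) to each of the n digits of s starting at position pos.
    function bump(s, pos, n,    i, out) {
        out = substr(s, 1, pos - 1)
        for (i = pos; i < pos + n; i++)
            out = out ((substr(s, i, 1) + 1) % 10)
        return out substr(s, pos + n)
    }
    NF >= 3 && $2 ~ /[0-9][0-9][0-9][0-9]$/ && $3 ~ /^......[0-9][0-9][0-9][0-9]/ {
        match($0, /^[ \t]*[^ \t]+[ \t]+/)        # leading blanks, column 1, separator
        p2 = RLENGTH + 1                         # where column 2 starts in the line
        match(substr($0, p2 + length($2)), /^[ \t]+/)
        p3 = p2 + length($2) + RLENGTH           # where column 3 starts in the line
        line = bump($0, p2 + length($2) - 4, 4)  # last 4 digits of column 2
        line = bump(line, p3 + 6, 4)             # digits 7 to 10 of column 3
        print line
        next
    }
    { print }                                    # everything else passes through
    ' "$@"
}
```

It would be called as `bump_in_place filename`. Each column is handled on its own, so it does not matter whether column 3 starts with the same number as column 2.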