Unix & Linux: How to add spaces in certain columns of a file in Linux
[+4] [6] Rozer
[2020-04-18 12:36:12]
[ ubuntu text-processing awk sed command ]
[ https://unix.stackexchange.com/questions/580901/how-to-add-spaces-in-certain-columns-of-a-file-in-linux ]

I have a text file containing 1000 lines in this format:

001122 abc def ghi
334455 xyz aaa bbb
667788 ccc ccc ddd

How can I convert it into this format using a Linux command by adding spaces to certain columns?

00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd
(2) Is the "certain column" always the first column, or are you looking for a solution that lets you specify which column to space by its number? - Ed Morton
@EdMorton I've assumed the latter in my answer, otherwise it would be a trivial question with dozens of duplicates already on the SE network. - Alex Stragies
You may want to look at unix.stackexchange.com/help/someone-answers, and also take the tour: unix.stackexchange.com/tour (you'll receive a badge when you've done so). - Kusalananda
Are work orders accepted on this site? Isn't any demonstrated effort required? "Don't just dump the problem statement" - Peter Mortensen
From Meta: "... Show the research that you’ve already done. As you already saw in How do I ask a good question?, tell us what you’ve already found and why it didn’t meet your needs)" - Peter Mortensen
[+5] [2020-04-18 16:30:24] Kusalananda

Naive but straightforward:

$ sed 's/\(..\)\(..\)\(..\)/\1 \2 \3/' file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

That is, match and capture the first three groups of two characters on each line, and space them out by inserting spaces between the groups in the replacement string.

Fancy but requires thinking:

$ sed 's/../ &/3; s/../ &/2' file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

The first expression replaces the 3rd match of .. on each line with a space followed by whatever those .. matched. The second expression then does the same for the 2nd match.
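The order of the two expressions matters, which is presumably the "thinking" part. A minimal sketch of why, assuming a POSIX sed; the variable names line, right and wrong are made up for illustration:

```shell
line='001122 abc def ghi'

# Highest occurrence first: inserting a space before the 3rd pair does not
# move pairs 1 and 2, so the second expression still finds the original 2nd pair.
right=$(printf '%s\n' "$line" | sed 's/../ &/3; s/../ &/2')

# Lowest occurrence first: the space inserted before the 2nd pair shifts
# how the rest of the line is split into pairs, so the "3rd match" seen by
# the second expression is no longer the original 3rd pair.
wrong=$(printf '%s\n' "$line" | sed 's/../ &/2; s/../ &/3')

printf 'high to low: %s\n' "$right"
printf 'low to high: %s\n' "$wrong"
```

Only the first ordering reproduces the requested 00 11 22 abc def ghi.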


1
[+4] [2020-04-18 12:43:43] GMaster

A simple sed command is all that is needed (replace filename with the actual file):

sed -E 's|([0-9]{2})([0-9]{2})([0-9]{2})[[:blank:]]*(.*)|\1 \2 \3 \4|g' filename

If you want to change the source file (filename) in place, pass in the -i option:

sed -i -E 's|([0-9]{2})([0-9]{2})([0-9]{2})[[:blank:]]*(.*)|\1 \2 \3 \4|g' filename

Explanation:

([0-9]{2}) matches groups of 2 digits 3 times

(.*) matches everything else which is all the letters

[[:blank:]]* matches space characters including tabs

\1 through \4 are matched groups

Note that this will only work with GNU sed. Almost all mainstream Linux distributions come with GNU sed. If you are using macOS, your sed is BSD sed, unless you have installed GNU sed, which is then available as gsed.
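For in-place editing across sed implementations, one option is to attach a backup suffix directly to -i, a form that both GNU sed and BSD sed accept. The following is a sketch under that assumption, run against a scratch file rather than real data; the names tmp, result and backup are made up for illustration, and the first sample line deliberately contains a tab:

```shell
# Scratch copy of sample data, so no real file is modified.
tmp=$(mktemp)
printf '001122\tabc def ghi\n334455 xyz aaa bbb\n' > "$tmp"

# Same expression as above; -i.bak (suffix attached, no space) edits the
# file in place and keeps the original contents in "$tmp.bak".
sed -E -i.bak 's|([0-9]{2})([0-9]{2})([0-9]{2})[[:blank:]]*(.*)|\1 \2 \3 \4|g' "$tmp"

result=$(cat "$tmp")
backup=$(cat "$tmp.bak")

printf 'edited:\n%s\n' "$result"
printf 'backup:\n%s\n' "$backup"
```

Once the edited file looks right, the scratch file and its .bak copy can be removed.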


Although technically correct, why bother using vertical bar when the more conventional forward slash would suffice? - DannyNiu
(1) I find it easier to use |. Although it does not make any difference in this instance, when you have strings like https://, you won't have to bother escaping https:\/\/ if you use |. - GMaster
Thank you for the quick reply, but the output looks like this: the spaces between the digits are correct, but about 7 extra spaces are added before the text. When I backspace once it becomes proper. Any help? ``` 00 11 22 abc def ghi 33 44 55 xyz aaa bbb 66 77 88 ccc ccc ddd ``` - Rozer
@Rozer It looks like there are some tabs or spaces that are not visible in the input text you provided. Anyway, I have updated my answer. Try it and see if it helps. - GMaster
Thank you @GMaster problem solved - Rozer
@Kusalananda using -i without a backup file suffix is also GNU only AFAIK. - Ed Morton
@EdMorton I don't know about the other BSDs, but OpenBSD sed groks -i with no option-argument. - Kusalananda
idk, I just see a lot of "why is this happening" sed questions answered with "because -i requires an argument with that sed", and I thought it was the default sed on macOS, which I think is BSD - Ed Morton
@Kusalananda See for example the serendipitously timed stackoverflow.com/q/61301662/1745001 :-) - Ed Morton
@EdMorton Well, non-standard option. What does seem to always work is to give the option a filename suffix. It would be best if those users read the manual for their particular tool, IMHO. - Kusalananda
2
[+4] [2020-04-18 14:56:24] Ed Morton

Using any awk in any shell on every UNIX box, letting you specify which column to change, and independent of the characters in that column:

$ awk -v c=1 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

$ awk -v c=2 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
001122 ab c def ghi
334455 xy z aaa bbb
667788 cc c ccc ddd

$ awk -v c=3 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
001122 abc de f ghi
334455 xyz aa a bbb
667788 ccc cc c ddd
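The same idea can be extended so that the group width is a parameter as well. The following is only a sketch, not part of the original answer: space_col is a hypothetical helper name, its arguments (column, width, optional files) are an assumed interface, and the width is assumed to be a positive integer. It uses substr() in a loop instead of a regular expression, so it does not depend on regex interval support:

```shell
# Hypothetical helper: space_col COLUMN WIDTH [FILE...]
# Inserts a space after every WIDTH characters of field COLUMN.
# Reads standard input when no FILE is given.
space_col() {
  col=$1 width=$2
  shift 2
  awk -v c="$col" -v n="$width" '{
    out = ""
    len = length($c)
    # Walk the field in chunks of n characters, joining them with spaces.
    for (i = 1; i <= len; i += n) {
      sep = (i > 1) ? " " : ""
      out = out sep substr($c, i, n)
    }
    $c = out   # assigning to a field rebuilds the line with single spaces
  } 1' "$@"
}

printf '001122 abc def ghi\n' | space_col 1 2
```

As with the commands above, assigning to a field makes awk rebuild the line, so runs of whitespace between fields collapse to single spaces.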

3
[+3] [2020-04-18 16:20:25] Alex Stragies

A generic version for any number/position of spaces in awk:

awk -v s='2,4' '{f=!split(s,a,",");for(i in a){r="^.{"a[i]+f++"}";gsub(r,"& ")}}1'
00 11 22 abc def ghi
⋮

A more powerful version, where characters other than a space can be inserted:

spacers(){
  awk -v s="$1" '{f=!split(s,a,/[^*0-9]*/);split(s,p,/[*0-9]*/);
                  for(i in a){if(""==b=a[i])continue;
                    r="^.{"(b!="*"?b+f++:length($0))"}";
                    gsub(r,"&"p[i+1])}}                          1' $2;}

That way, you can do e.g.:

spacers '0|2 4 6|10@yahoo.com |* |' file
|00 11 22| abc@yahoo.com | def ghi |

which is great for creating org-mode tables and piping directly to clipboard.

Note: The shell-function also accepts data through STDIN.

(Earlier versions of this answer contained a generic awk solution that used sed for the final replacement.)


4
[+3] [2020-04-18 23:05:08] iruvar

If the input data is exactly as depicted, GNU cut is an option. Note that the --output-delimiter has to be explicitly set to a space. This makes for a very rigid solution unlike some of the other answers, lacking both the flexibility to deal with variable string length in the first field and the ability to designate an arbitrary field to operate on.

cut -c1-2,3-4,5- --output-delimiter=' ' <file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

5
[+2] [2020-04-18 15:22:50] bu5hman

Being completely lazy about typing here,

sed -E "s/([0-9]{2})/\1 /g; s/ +/ /g" file1

Put a space after every pair of digits and then reduce the multiple spaces to a singleton.

Or, perhaps even lazier

sed 's/./& /4;s/./& /2' file1

6