Using sed

As I’ve written about before (see engineering), I often use CalculiX for structural mechanics calculations of products made from composites. These often involve calculating lots of different combinations of materials, lay-ups and load cases. To keep each variant self-contained and convenient to work with, it usually lives in its own directory.

So when I find an improvement to e.g. a pre.fbd, Makefile or job.inp, I want to modify all of the other instances as well. This is where sed (which stands for “stream editor”) comes into play.

Introduction

Why sed?

In the past I’ve applied these changes by hand. This is time-consuming and it is easy to forget to modify a file.

Using diff and patch is often not a good alternative. First, diff reports all changes, including trivial ones like file modification dates in the header. If you want to ignore some changes, you will have to manually edit the patch file. This is doable if you create diff output in the form of an ed script or in the default output format. It is more involved to do this in a unified diff. Second, diff output is bound to line numbers. If the line to be changed has moved, the patch will fail.

One of the strengths of sed is that it is more flexible on where it can apply changes. You can specify a line number, but you can also use e.g. a regular expression to search for a line.
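As a small illustration of that difference, the sketch below applies the same substitution twice to input where the target line has moved down by one: once addressed by line number, once by regular expression. The sample lines are made up for this example.

```shell
# Made-up input: an extra first line has pushed "valu qdiv" from line 2 to line 3.
new='# extra comment line
valu rad 5
valu qdiv 10
valu len 3'

# Addressed by the old line number: line 2 no longer ends in 10, nothing changes.
by_number=$(printf '%s\n' "$new" | sed '2s/10$/20/')

# Addressed by regular expression: the line is found wherever it is.
by_regex=$(printf '%s\n' "$new" | sed '/valu qdiv/s/10$/20/')

printf '%s\n' "$by_regex"
```

The numbered version silently leaves the file as it was, while the regular expression version still changes the intended line.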

It is possible to write scripts that apply changes to files in many different programming languages, especially those that come with regular expression libraries. In general, those will be much less succinct than a sed script.

Why this article?

Usually, when sed examples are shown, you will mostly just see the s-function in action. That is, replacement in a single line based on regular expressions.

While this is probably the most often used case, I want to show some of the other functions that it has.

Which sed?

To clarify, I am not talking about GNU sed here, but about the one that comes with FreeBSD. It is a superset of the POSIX.2 specification. The FreeBSD extensions are documented in the STANDARDS section of the manual, which you can find here. The manual is not very large; about 8 pages when rendered as a PDF. It is complete, but rather tersely written. Read it carefully.

Sed functions

For this example, I’ve put the sed commands in a file, so I can pass that file to sed with the -f option. These commands modify a script for the CalculiX preprocessor, cgx, so the text they insert is in the cgx command language.
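A minimal sketch of how such a command file could be applied to every variant directory. The directory names, the file name changes.sed and the one-line contents of the files are assumptions for this example, not my actual project layout.

```shell
# Hypothetical layout: one directory per variant, each with its own pre.fbd,
# and the sed commands stored in changes.sed.
work=$(mktemp -d)
mkdir "$work/variant-a" "$work/variant-b"
printf 'valu qdiv 10\n' > "$work/variant-a/pre.fbd"
printf 'valu qdiv 10\n' > "$work/variant-b/pre.fbd"
printf '/valu qdiv/s/10$/20/\n' > "$work/changes.sed"

# Apply the command file to every variant. Writing to a temporary file and
# moving it into place avoids the -i option, whose syntax differs between
# sed implementations.
for f in "$work"/*/pre.fbd; do
    sed -f "$work/changes.sed" "$f" > "$f.new" && mv "$f.new" "$f"
done

cat "$work/variant-a/pre.fbd" "$work/variant-b/pre.fbd"
```

Because the original is only replaced when sed exits successfully, a mistake in the command file does not leave a truncated pre.fbd behind.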

Context

The reason I’m using sed here is basically to make the preprocessor script parametric.

In the original script, I generated a relatively coarse mesh. The fixation points and the point where the loads would act were determined by looking at a picture of the mesh with the node numbers shown. In case of the fixation points those node numbers were simply added to a set named “fix”. Each load point got its node in a set with a single member.

However, if I were to make the mesh finer (to get a more accurate picture), the node numbers of the fixations and loads would surely change. And then I’d have to look at the mesh to determine those again. This is both time-consuming and error prone.

So I wanted to use the cgx command language, particularly the enq (enquire) command, to determine the numbers of the support nodes.

The code

Let me show the file in its entirety, and then we will go through the different commands.

/ulin green/a\
valu lldiv 16
/line L00[58]/s/8$/lldiv/
/valu qdiv/s/10$/20/
/seta fix n/c\
seta nodes n all\
valu rtol 0.01\
enq nodes f1 cx 0.9839 0 _ rtol i\
seta fix n f1\
enq nodes f2 cx 0.9839 90 _ rtol i\
seta fix n f2\
enq nodes f3 cx 0.9839 180 _ rtol i\
seta fix n f3\
enq nodes f4 cx 0.9839 270 _ rtol i\
seta fix n f4
/seta load6 n/c\
enq nodes load6 cx _ 180 0.823 rtol i
/seta load7 n/c\
enq nodes load7 cx _ 180 0.24775 rtol i

Let’s look at these in turn.

Explanation

/ulin green/a\
valu lldiv 16

This looks for a line that contains the text “ulin green”, and appends the line “valu lldiv 16” after it. In this case lldiv is a parameter that determines into how many elements a line should be divided.

The part between / is an address, in the form of a regular expression.

Had I used “i” instead of “a”, the extra line would be inserted before the anchor.
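To see the difference between the two, here is a small sketch that runs both variants on the same three lines. Only the “ulin green” line comes from my script; the other two input lines are made up.

```shell
# Made-up input around the anchor line.
input='ulin red
ulin green
ulin blue'

# "a" appends the text after the matching line.
appended=$(printf '%s\n' "$input" | sed '/ulin green/a\
valu lldiv 16')

# "i" inserts the text before the matching line.
inserted=$(printf '%s\n' "$input" | sed '/ulin green/i\
valu lldiv 16')

printf '%s\n---\n%s\n' "$appended" "$inserted"
```

Inside single quotes the shell passes the backslash and the newline on unchanged, which is the form sed expects for these functions.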

/line L00[58]/s/8$/lldiv/

Look for a line that contains “line L005” or “line L008”. In that line, substitute an “8” at the end of the line by “lldiv”. This is what actually makes those lines parametric.
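The following sketch runs this command on a few made-up line definitions (the point names are assumptions, not taken from my model). It shows that the address selects only L005 and L008, and that the substitution touches only the trailing “8”, not the one in the name “L008”.

```shell
# Made-up cgx line definitions, all with 8 divisions.
input='line L004 P004 P005 8
line L005 P005 P006 8
line L008 P008 P009 8
line L018 P018 P019 8'

# Only lines matching the address are candidates; 8$ is anchored to the end.
result=$(printf '%s\n' "$input" | sed '/line L00[58]/s/8$/lldiv/')

printf '%s\n' "$result"
```

L004 and L018 are left alone because they do not match the bracket expression in the address.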

/valu qdiv/s/10$/20/

Look for lines that contain “valu qdiv”, and substitute a “10” at the end of the line by “20”. This is another parameter, for the number of divisions in a rotating sweep.

/seta fix n/c\
seta nodes n all\
valu rtol 0.01\
enq nodes f1 cx 0.9839 0 _ rtol i\
seta fix n f1\
enq nodes f2 cx 0.9839 90 _ rtol i\
seta fix n f2\
enq nodes f3 cx 0.9839 180 _ rtol i\
seta fix n f3\
enq nodes f4 cx 0.9839 270 _ rtol i\
seta fix n f4

The “c” function changes (replaces) a line. In this case, a line containing “seta fix n” is replaced by the text on the next lines. All but the last line following the function end with a \ followed by a newline. This is a signal to sed that this is a multi-line text. The line of fixed node numbers is replaced by several enquire commands that add nodes to the set “fix”.
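A shortened sketch of the same idea, using only the first enquire command. The node numbers in the input are made up; the replacement text is taken from the command above.

```shell
# Made-up input: the line with fixed node numbers sits between two other lines.
input='seta load5 n 101
seta fix n 12 34 56 78
seta load6 n 202'

# Every text line except the last ends in a backslash, so sed keeps reading.
result=$(printf '%s\n' "$input" | sed '/seta fix n/c\
seta nodes n all\
valu rtol 0.01\
enq nodes f1 cx 0.9839 0 _ rtol i\
seta fix n f1')

printf '%s\n' "$result"
```

The new “seta fix n f1” line is not matched again by the address, since the replacement text is written straight to the output.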

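The last two commands work the same way, except that each replaces the matching line by a single line: the fixed node number of a load point is replaced by an enquire command that fills the set.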

For comments, please send me an e-mail.

