Visual Studio: Regex Replace
Posted in Articles on March 17th, 2011 by Pyroka
Reg what?
“A regular expression (regex or regexp for short) is a special text string for describing a search pattern. You can think of regular expressions as wildcards on steroids.” – Regular-Expressions.info
That sums up regular expressions better than I ever could (and if you want to learn them, Regular-Expressions.info is probably the best site to start from). Of course, anyone who knows regexes knows how amazingly useful they are. However, C++ doesn’t directly support them (Boost has an implementation, and I’m not sure I want to see the black magic behind that), and most game programmers I know don’t know or use them. This is understandable: they’re kinda slow (it has to match how many strings?), they can be very complex, and they can be summed up equally well with another quote:
“Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems” – Jamie Zawinski, in comp.lang.emacs
But that is mainly a response to the effort of maintaining them: like all code, regular expressions tend to evolve over time and can grow to unwieldy sizes fairly fast (it also takes a while to learn how to read them).
So what is this Regex Replace thing?
Ok, time to fire up Visual Studio. Go on, I’ll wait… Ok, now load up a project you’re working on (I know you have one), highlight some code and press Ctrl-H (the shortcut for Find and Replace). You should have something similar to this:
Now the bit you’re looking for is the bottom check-box, the one that says ‘Use:’. Tick that and the drop-down box below it becomes active, giving you two options: ‘Wildcards’ or ‘Regular Expressions’. In this case Wildcards are a watered-down version of Regular Expressions with fewer special characters, but we don’t want no watered-down nonsense, so select the full Regular Expressions.
We can now type regular expressions into each of the text-boxes. The arrows at the sides of the ‘Find what:’ and ‘Replace with:’ text-boxes will now be highlighted, and clicking them will give you a list of special characters. (Note that if you’re used to ‘normal’ regular expressions this may be slightly confusing, as Visual Studio uses some different special characters.)
Now I’m going to take you through a real-life example where this saved me a lot of time. Just the other day I had the following code:
    public uint TypeToConerBytes(int type)
    {
        // Packing order (top-left | top-right | bottom-right | bottom-left)
        switch (type)
        {
            case 0:
                return (0 << 24 | 0 << 16 | 0 << 8 | 0);
            case 1:
                return (1 << 24 | 1 << 16 | 1 << 8 | 1);
            case 2:
                return (0 << 24 | 1 << 16 | 0 << 8 | 1);
            case 3:
                return (0 << 24 | 0 << 16 | 1 << 8 | 1);
            case 4:
                return (1 << 24 | 0 << 16 | 1 << 8 | 0);
            case 5:
                return (1 << 24 | 1 << 16 | 0 << 8 | 0);
            case 6:
                return (0 << 24 | 0 << 16 | 0 << 8 | 1);
            case 7:
                return (0 << 24 | 0 << 16 | 1 << 8 | 0);
            case 8:
                return (0 << 24 | 1 << 16 | 0 << 8 | 0);
            case 9:
                return (1 << 24 | 0 << 16 | 0 << 8 | 0);
            case 10:
                return (1 << 24 | 1 << 16 | 0 << 8 | 1);
            case 11:
                return (1 << 24 | 1 << 16 | 1 << 8 | 0);
            case 12:
                return (0 << 24 | 1 << 16 | 1 << 8 | 1);
            case 13:
                return (1 << 24 | 0 << 16 | 1 << 8 | 1);
            case 14:
                return (1 << 24 | 0 << 16 | 0 << 8 | 1);
            case 15:
                return (0 << 24 | 1 << 16 | 1 << 8 | 0);
            default:
                throw new NotImplementedException();
        }
    }
Now, as you can see from my useful comment, this code represents a number of different combinations of four bytes packed into an unsigned integer. It just so happened that, as the feature I was implementing progressed, I needed to change the packing order of the bytes from ‘(top-left | top-right | bottom-right | bottom-left)’ to ‘(top-left | top-right | bottom-left | bottom-right)’ (switching the last two bytes).
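To make the change concrete, here is a minimal sketch of the two packing orders. The helpers PackOld and PackNew are hypothetical names made up for this illustration (they are not from the project above), and the snippet assumes C# 9 or later so it can run as top-level statements.

```csharp
using System;

// Hypothetical helpers, for illustration only.
// Old order: top-left | top-right | bottom-right | bottom-left
uint PackOld(uint tl, uint tr, uint br, uint bl) => tl << 24 | tr << 16 | br << 8 | bl;

// New order: top-left | top-right | bottom-left | bottom-right
uint PackNew(uint tl, uint tr, uint br, uint bl) => tl << 24 | tr << 16 | bl << 8 | br;

// Case 2 from the switch: top-left = 0, top-right = 1, bottom-right = 0, bottom-left = 1.
Console.WriteLine($"old: 0x{PackOld(0, 1, 0, 1):X8}"); // old: 0x00010001
Console.WriteLine($"new: 0x{PackNew(0, 1, 0, 1):X8}"); // new: 0x00010100
```

Only the two lowest bytes trade places, which is why the edit needed in each case statement is limited to the last two terms.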
You could, of course, do this manually, but it would take a while and you’d probably miss a few, add a typo or introduce some other subtle bug. The better option, in my opinion at least, is to use what I consider to be one of the most useful features of Visual Studio: Regex Replace.
Ok, paste that code into a document in Visual Studio and hit Ctrl-H. In the find box you want to type:
\| {:z} \<\< 8 \| {:z}\);
That may look complicated, so I’ll walk you through it. The first thing we want to match is the ‘|’ character; however, ‘|’ is a special character so it needs to be escaped (hence the ‘\’ before it). Then we want to match a space and then an integer, which in our case could be either 1 or 0 (‘:z’ is the special character to match any integer). We also want to tag that number (explained more below), so we wrap it in {}’s. Next is another space, followed by the left-shift operator (again ‘<’ is a special character, so both need to be escaped), another space, the 8, another space, then another escaped ‘|’ character, a space, an integer (also tagged), then the closing bracket (another special character) and finally the semicolon. (Whew, that was a lot of explanation for a 25 character string.) This ‘find’ string will match the last two bytes we are packing in each of the case statements.
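If the Visual Studio dialect is unfamiliar, it may help to see the same search written for .NET’s Regex class. This is only a rough translation and an assumption on my part: the {}’s become ()’s, ‘:z’ becomes ‘\d+’, and ‘<’ needs no escaping. The variable names are made up for the example, and it assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Rough .NET equivalent of the 'Find what:' string (an assumed translation).
string pattern = @"\| (\d+) << 8 \| (\d+)\);";

// Case 2 from the switch statement.
string line = "return (0 << 24 | 1 << 16 | 0 << 8 | 1);";
Match m = Regex.Match(line, pattern);

Console.WriteLine(m.Value);           // | 0 << 8 | 1);
Console.WriteLine(m.Groups[1].Value); // 0  (first tagged integer)
Console.WriteLine(m.Groups[2].Value); // 1  (second tagged integer)
```

Note that the match starts at the ‘|’ in front of the third byte, not at the byte itself, because the pattern only accepts a shift by 8.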
Now for the replace box, enter this:
| \2 << 8 | \1);
That’s it, a simpler string this time so it should take less explaining. It starts with a ‘|’ and a space, because the find string matched the ‘|’ in front of the first integer and it would otherwise be lost; ‘|’ is not a special character in the replace box, so it needs no escaping. Next comes ‘\2’, which is special: this time the ‘\’ does not denote an escape character but is in fact part of the special character. Remember the {}’s that were used to ‘tag’ certain values (in our case, the two integers)? Well, this inserts the second tagged value, so if the second ‘:z’ matched a 0 it would put a 0, if it matched a 1 it would put a 1, etc. Then we have a space, followed by the left-shift (most special characters allowed in the find box would make no sense in the replace box and are thus not allowed, so there’s no need to escape the ‘<’ this time), a space, the 8, another space, then the ‘|’ (again, not special in replace), another space, then the first tagged value, and finally the ‘)’ (not special) and semicolon.
Hit ‘Replace All’ and you will have successfully switched the packing order of the last two bytes for each of the statements.
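For comparison, the whole find-and-replace can be sketched with Regex.Replace. Again this is an assumed translation into .NET syntax, where the tagged values are written ‘$1’ and ‘$2’; the variable names are hypothetical and the snippet assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed .NET translations of the find and replace strings.
string find = @"\| (\d+) << 8 \| (\d+)\);";
// The leading "| " puts back the '|' that the find string consumed.
string replace = "| $2 << 8 | $1);";

string before =
    "case 2:\n" +
    "return (0 << 24 | 1 << 16 | 0 << 8 | 1);\n" +
    "case 3:\n" +
    "return (0 << 24 | 0 << 16 | 1 << 8 | 1);\n";

// Regex.Replace rewrites every match in the input, much like 'Replace All'.
string after = Regex.Replace(before, find, replace);
Console.Write(after);
```

Case 2 comes out as ‘(0 << 24 | 1 << 16 | 1 << 8 | 0)’, while case 3 is unchanged because its last two bytes are equal.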
Conclusion
As you can see, this is a powerful technique; however, it takes some practice. At first it will take you a while to form the correct statements (especially if you are generally unfamiliar with regular expressions), but with time and practice you will have another neat trick which will improve your workflow and decrease bugs (compared with doing it manually). Since learning that Visual Studio can do this I’ve used it only a few times, but the amount of time it has saved me in each of those instances is the reason I regard this as such a powerful feature to know.
