Before you read this post, take a look at Paul Filkin’s excellent post on this topic:
Trados’ default “Catch all” regex worked fine for us until now. But when an Excel cell contains more complex HTML code, the text in Trados unfortunately starts to look like this:
This is no fun to work with, so I had to come up with something a bit more detailed and specify all HTML tags manually. Paul used the tag names <p> and </p> in his example, but this falls short when the tags contain class names or styles, like so:
<p class="ms-rtePosition-1" style="margin:0px 10px 0px 0px;" />
In this case, you need a regular expression that’s a bit more complex:
Start tag: <p(>|\s+[^>]*>)
End tag: </p>
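This will catch tags like the one in the example above. The group in the start tag expression is an OR statement: after <p comes either a closing > straight away (a plain <p>), or at least one whitespace character followed by anything up to the next > (a <p> with attributes). Because the character after <p must be either > or whitespace, tags like <pre> will not match this regex.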
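If you want to sanity-check the pattern outside Trados, a minimal sketch in Python could look like the one below. Python’s re module is an assumption made for illustration only; Trados Studio has its own regex engine, which may behave differently in edge cases, although the basic constructs used here are common to most flavours. The sample strings and the helper start_tag_pattern are hypothetical and not part of the settings file.

```python
import re

# Start-tag pattern from the post: "<p" followed either directly by ">"
# or by whitespace and any attributes up to the closing ">".
START_P = re.compile(r"<p(>|\s+[^>]*>)")
END_P = re.compile(r"</p>")

# Hypothetical sample tags, chosen to show what should and should not match.
samples = [
    "<p>",
    '<p class="ms-rtePosition-1" style="margin:0px 10px 0px 0px;" />',
    "<pre>",
    "<param name='x'>",
]

for tag in samples:
    # fullmatch requires the whole string to be one start tag.
    print(f"{tag!r}: {bool(START_P.fullmatch(tag))}")


def start_tag_pattern(tag_name: str) -> str:
    """Hypothetical helper: build the same kind of pattern for another tag."""
    return rf"<{re.escape(tag_name)}(>|\s+[^>]*>)"
```

The first two samples match and the last two do not, because the character right after <p is neither > nor whitespace. Calling start_tag_pattern("span") gives the equivalent start tag expression for <span>, which is one way to avoid typing each pattern by hand.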
I’ve added all the HTML tags that I need for the project and uploaded the settings file here.
That should save you from going through the same process and setting up everything manually. Simply download the file and import the settings by going to Project Settings -> File Settings in SDL Trados Studio.
If you’d like to build the settings yourself, or you’re simply interested in the HTML tags I’ve already set up, here is a screenshot: