I have been writing a lot recently about the little obstacles I have come across while helping the Church of England move content from their old sites to their new ones. Making a list of all the content to be moved is often the first job, but many of the older or proprietary platforms cannot export a list of posts or pages, as a .csv for example. Usually copying and pasting the list from the back end of the site is the only option, but then I need to strip out all the junk that copies over with it, such as edit and delete buttons.
Today I needed to remove a lot of date/time stamps, and found the quickest way to do this was once more a regex find and replace in a text editor (Gedit). I had already removed various words from the text, such as “Edit” and “Delete”.
I used:
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]
This means any digit from 0 to 9 in each position of the time stamp format used. Here is an example:
2013-03-16 10:00:00
All removed just fine. I then used a different regex to remove all the double line breaks.
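For anyone who would rather script this than use an editor, here is a minimal sketch of the same two steps in Python. The sample text is made up for illustration, and `[0-9]{4}` is just a shorter way of writing `[0-9][0-9][0-9][0-9]`. The line-break pattern is my assumption of what such a clean-up might look like, not the exact regex I used in Gedit.

```python
import re

# Made-up sample of text pasted from a site's back end (illustration only).
pasted = "About us\n2013-03-16 10:00:00\n\nContact\n2014-11-02 09:30:15\n\nNews\n"

# Same time stamp pattern as above, using {n} to repeat [0-9] n times.
TIMESTAMP = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}")

# Step 1: replace every time stamp with nothing.
without_stamps = TIMESTAMP.sub("", pasted)

# Step 2: collapse any run of two or more line breaks into a single one.
cleaned = re.sub(r"\n{2,}", "\n", without_stamps)

print(cleaned)
```

Doing the time stamps first matters here, because removing them is what leaves the extra empty lines behind.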
Here is another example with a different date format. This time the regex was:
[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]
And the dates were:
27/01/2016
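The same idea as a short Python sketch, in case it helps. The page titles are hypothetical, and the pattern only checks the shape of the date (two digits, slash, two digits, slash, four digits), not whether it is a real calendar date.

```python
import re

# Hypothetical lines copied from a list of pages (illustration only).
lines = ["Parish newsletter 27/01/2016", "Harvest festival 03/10/2015"]

# Same date pattern as above, using {n} to repeat [0-9] n times.
DATE = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")

# Remove the date from each line, then trim the space left behind.
titles = [DATE.sub("", line).strip() for line in lines]

print(titles)
```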
Thanks for this – I’ve been trying to remove the time stamps from a chess game, in one go, without success. First 5 moves:
1. e4 {[%clk 00:05:00]} Nc6 {[%clk 00:05:00]}
2. d4 {[%clk 00:05:02]} e5 {[%clk 00:04:56]}
3. d5 {[%clk 00:05:05]} Nd4 {[%clk 00:04:59]}
4. Be3 {[%clk 00:04:54]} c5 {[%clk 00:04:59]}
5. c3 {[%clk 00:04:51]} Qa5 {[%clk 00:05:00]}
The hope is to create a PGN file. I did succeed with that, but only by using a long-winded and tedious method of find and replace in Gedit.
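Clock annotations like the ones in these moves can probably be removed in one pass, because they all share one shape: `{[%clk` followed by a time and `]}`. Below is a rough sketch in Python, assuming every clock value is written as two digits for hours, minutes and seconds, as in the sample. The braces and square brackets are escaped with a backslash because they have special meaning in a regex. The same pattern should also work in an editor's regex find and replace, though I have only reasoned it through against the moves shown.

```python
import re

# First two moves from the sample above.
moves = """1. e4 {[%clk 00:05:00]} Nc6 {[%clk 00:05:00]}
2. d4 {[%clk 00:05:02]} e5 {[%clk 00:04:56]}"""

# An optional leading space, then {[%clk HH:MM:SS]} with the brackets escaped.
CLOCK = re.compile(r" ?\{\[%clk [0-9]{2}:[0-9]{2}:[0-9]{2}\]\}")

# Replace every clock annotation with nothing.
pgn_moves = CLOCK.sub("", moves)

print(pgn_moves)
```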
I could kiss you, I am here four years later, but this works in Microsoft Word as well – you have saved me HOURS of work on transcripts I need to process. Many thanks!!