I am currently doing some work for the Church of England, moving and improving the repository of liturgy to their new website. There are a lot of files involved so using shortcuts for sorting out data and lists is vital.
Removing unneeded items from a list
Today I was working with an export of a section of the old site, counting pages and creating a checklist of them to work through: in the export I had, there was a duplicate of every page which included the full site content, not just the body html of that page. These items all had “.aspx?v=text.html” in the string, and I didn’t need them to be included on my list.
COUNTIF and text in Libreoffice
I used Libreoffice on Fedora linux. My list of files was running down column A: in Column B1 I put
I then clicked the little blue box in the corner of B1 to run the formula down to the end of my list (B351). This puts a 1 in the B cell if the list item in the A cell contains the “.aspx?v=text.html”
Seeing all zero’s?
If you are seeing a 0 in cell B where you can see that the text string IS in fact present in cell A, then you have the same problem I had! I found a solution in a forum someplace. Go to:
Tools > Options > LibreOffice Calc > Calculate
And uncheck Search criteria = and < > must apply to whole cells.
Then click okay, and go back to your sheet. You need to recalculate, so click the blue “autofill” button in the corner of the top cell again.
You should now see a 1 if the string is present and a 0 if it is not.
In a separate incident I found i needed to check the Wild-cards box in that same settings page, so if your formula is still giving you 0’s incorrectly, try that.
Sort by column b
Then, select all your data – the list and the 0’s and 1’s and go to Data > Sort. In Sort key 1 you need your 0’s and 1’s column (in my case Column B). Click okay. Then, all your 1’s and 0’s will be grouped together. You can select and delete either group depending on your case. In my case, I deleted all the items which included “.aspx?v=text.html” – those where the duplicates I wanted to ditch.