Tutorial – Conditional Formatting in Excel

Hello Digital Humanists, today I’ll talk about how to analyze data and clean small datasets with Conditional Formatting in Microsoft Excel.

This is one of Excel’s best features, in my humble opinion. I’m not sure how different/available this will be in Google Sheets, but if you’ve got Excel you should be using this IMO. First, you’re going to want a dataset with enough values that you can comfortably scroll through them all in a short time. Otherwise this method is going to be much less useful, not to mention the amount of lag it will cause in Excel. Here I’ve made a small dataset with some values, as well as two “bad” values at the end.

In order to use Conditional Formatting, you’re going to want to start by simply hilighting the cells you want to format (in this case the Gender column). After that, click the button that says Conditional Formatting, and press “New Rule…”

Change the style to Classic, as 2-Color Scale is kind of bad.

Now change the first option to “Format only cells that contain”

These should be the final settings to make some set of cells red. In this case, we’ll make everything with “F” red.

And there we go, all of the cells with “F” are formatted as we saw earlier!

Now let’s do something different. The default options available are red, yellow, and green, but what if we wanted blue? You can do that too, by going to “Format with” and selecting “custom format…”

Excel really likes using elipses for some reason. Don’t ask me!

Here we can select the Font and the Fill of what we want, which I’ll change to a dark and light blue respectively.

After that, we can see a preview of the formatting in the rule settings, to make sure it looks right.

Looks good to me, let’s hit OK.

And there we go, the Gender column is formatted. These colors aren’t great together, but it’s not a huge deal to change them.

This change was mostly for a viewer’s convenience, it didn’t really help us do much with the data. Conditional Formatting is capable of doing that too though, so let’s look at a scenario like that. First, in the “Random Number” column I’ve simply got a random 4-digit number for each person. You can think of this like a Student ID. If anyone’s number is outside of the range of 0-9999, that’s probably an error and something we want to remove before doing anything with this data. To check if we have this problem with Conditional Formatting, just hilight the column and select a New Rule.

We don’t want to look for text, since these are numbers, we want to hilight certain numerical values. We’ll make that setting change here.

Change the second box to “greater than”, and make the number 9999

Double check your settings, and click OK to activate the rule.

Hey look, we found an error! Let’s get rid of that row to clean our data.

There might be issues with word columns too. Let’s make sure that, for example, there’s no duplicate names. To do so, we’ll make a special rule that checks for duplicates in a region. Hilight the names column and select “Duplicate Values…” from the options as shown.

This will make a new rule with all the settings pregenerated correctly to find duplicates. Unless you want to change the look of the formatting for some reason, just click OK. Once again, we found a duplicate! It looks like someone made a double entry for me.

Let’s go ahead and delete that row, just like we did with the Random Number error. Once we’ve done that, we’ve cleaned the data …at least in respect to these two possible issues. Be careful with applying too many rules, as it can cause the program to lag if you make too many.

I’m not sure how many people still use Excel, given Microsoft switching Office to a subscription service and Google Sheets’ rising popularity, but if anyone does this is a neat trick I use all the time. I hope you found this useful, or at least interesting.

EvelynS

7 Comments

  1. Great tutorial, Evelyn! I found it really easy to understand and enjoyed following along. The “Duplicate Values” function is one I had never heard of before and I definitely plan on using that in the future. If you don’t mind me asking, why is the 2-color scale bad?

  2. Wow this is incredibly helpful! I didn’t know how to do this in Excel or Google Sheets, but its a really powerful way to categorize variables and get a sense of possible relationships before making graphs (which is the main reason I use Excel). It’s also much easier than I thought it would be. This tutorial was really easy to understand and you added a perfect amount of visuals vs. text!

  3. This was really handy! My dad works in finances and I’m realizing he probably has tried explaining this process to me a couple of times, now I get it.

  4. Thanks for your post, Evelyn! I thought that this was a really interesting and powerful tool to explore! As was mentioned in class, this tool seems to also be a favorite amongst accountants, and my mom is an accountant, so I’ll have to ask her about how she uses this in her work!

  5. Hi Evelyn, great tutorial! I often use Excel to organize my dataset from labs and do some quick calculations, but I had never figured how to do any fancy stuff like conditional highlighting, so I really appreciated your step-by-step. I definitely will be using it in the future, and I wish I had known about it when I was working on my midterm for this class since I had to do a lot of sorting and organizing of the dataset. I got a little stuck on the step where you said to go to “format with…” and “select format…” (my excel tool bar looks different from the one in the screenshot) so maybe adding a couple words on where to find the “format with…” button would be helpful. But other than that, everything was very easy to follow along. Great tutorial and super helpful!

  6. Great tutorial Evelyn! A really useful feature to know in Excel. It would be even more helpful if you provided a download of the sample dataset you used so readers could follow along.

  7. Hi Evelyn, thanks for the tutorial, your explanations were really easy to follow! I’ve struggled using Excel in the past, and this function looks like it’ll really come in handy with the type of datasets I use.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.