Microsoft PowerShell is a powerful scripting language and administrative framework for Windows, and one of the key elements that makes it so powerful is the pipeline — the assembly line of data and results that moves between and through cmdlets. In this piece, we’re going to talk about how you glue stuff together — or, more specifically, how you take the output or results from one PowerShell cmdlet and send it into another for further processing.
This is called piping, and the invisible tube that connects one cmdlet to another is the pipeline. The character that represents this all:
|
It’s known as the pipe, and on a standard U.S. keyboard it shares a key with the backslash: hold Shift and press the backslash key to type it.
Tapping the pipeline
I think the best way to demonstrate the pipe and a pipeline is to do a simple example. But before we do that, I need to introduce two helpful features of PowerShell:
- format-list, which takes the output of almost any cmdlet and formats it in a list that explodes all relevant details
- format-table, which formats output in a nice text-based table
Format-list and format-table are absolutely dependent on the pipeline. You can’t just issue a format-list command — there has to be data to format in the first place. You get that data to the format-list cmdlet through the pipeline.
Remember our get-process cmdlet from my first article on PowerShell basics? Let’s practice pipelining by asking it to give us more information on the Google Chrome browser process formatted as a list:
get-process chrome | format-list
Here’s what we get back:
These are all of the Chrome processes on my machine right now, formatted as a list, with their properties exposed and expanded. We took the output of get-process chrome and piped it using the | character into the format-list cmdlet.
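For comparison, here is a minimal sketch of the same idea with both formatting cmdlets side by side. It points get-process at the current PowerShell session (through the built-in $PID variable) instead of Chrome, purely so that it runs on any machine; the Name, Id, and CPU columns are just an illustrative choice.

```powershell
# $PID is a built-in variable holding the ID of the current PowerShell process,
# so this works whether or not Chrome is running.
$me = Get-Process -Id $PID

# Same object, two presentations: first as a list of selected properties...
$asList = $me | Format-List Name, Id, CPU | Out-String

# ...then as a table with the same properties as columns.
$asTable = $me | Format-Table Name, Id, CPU | Out-String

$asList
$asTable
```

Out-String is only there to capture the formatted text in a variable; at the prompt you would normally stop at format-list or format-table.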
Filtering and limiting
One of the most common uses of pipelining is to take the output of one cmdlet and then filter it down into a certain subset of results; once you have filtered out the “noise” and you have your desired results, you then pipe that subresult set into another cmdlet to do some further magic.
This is where the where-object cmdlet comes in. Where-object is one of the filtering mechanisms in PowerShell, and you use it by putting together “where clauses.” Now, the formatting of where-object gets a little funky, so stay with me while I show it to you.
First off, you need to know some notations:
- { and } are curly brackets (in the same area as your backslash key, at least on U.S. keyboards) and they denote logic — when you are making comparisons or setting up criteria for filtering, you enclose the criteria in curly brackets. It’s kind of like the = sign in Excel that begins a formula, as in =sum, =average, and so on.
- $_ is shorthand for the current object. Think of this in your head as “whatever it is you’re working on.” For example, the stuff (actually, the .NET object — remember, all PowerShell output is a .NET object) you get from the get-process cmdlet can be referred to in the pipeline next as $_.
- -lt is less than (i.e., 3 < 4, 3 is less than 4)
- -le is less than or equal to
- -gt is greater than
- -ge is greater than or equal to
- -eq is equal to
- -ne is not equal to
- -like is how you use text phrases for matching words and patterns
That might feel a little down in the weeds, but the good news is that this syntax, this way of formatting is found all over PowerShell. Learn it once and you’ll be good to go and can use it everywhere.
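If you want to get a feel for these operators before using them in a filter, you can type comparisons straight into the PowerShell window. The values below are arbitrary examples; each line simply evaluates to True or False.

```powershell
# Each line is a comparison that evaluates to True or False.
3 -lt 4                  # True:  3 is less than 4
4 -le 4                  # True:  4 is less than or equal to 4
5 -gt 7                  # False: 5 is not greater than 7
7 -ge 7                  # True:  7 is greater than or equal to 7
3 -ne 3                  # False: the two values are equal
3 -eq 3                  # True:  the two values are equal
'chrome' -like 'chr*'    # True:  * is a wildcard that matches any run of characters
'Chrome' -eq 'chrome'    # True:  string comparisons ignore case by default
```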
To use the where-object cmdlet, you simply type in where-object and then a left curly brace, and then you start adding properties to compare to. You finish it off with a right curly brace so PowerShell knows that is the end of your query.
You may recall that objects have properties. In my first article, we looked at get-command in general and then we wanted to find the number of cmdlets in total that were available to us on the system, so we queried a property of the get-command result object by using
(get-command).count
The logic in the where-object where clauses works the same way, except instead of putting (get-command) in parentheses, we can simply use the notation
$_
I like to call it “that thing” when I’m thinking out loud. In other words, I want to compare that thing’s property to something else. You can use the same property syntax we talked about in my PowerShell introduction article — add a period and then the name of the property — within the where clause.
So to put together a where clause, I simply say “where-object” and then add a left curly brace, the “that thing notation,” a period, and then the name of the property. Then I add one of the comparisons I listed above, then whatever the value or number is that I’m comparing to, and then finish off with the right curly brace.
Let’s return to our processes example. Let’s say we want to find all processes that have used at least 1,000 seconds of CPU time across all of the processors in the system. This can indicate a long-running process, or a process that’s gone out of control, or something else. We’ll use get-process to get all of the initial process information, and where-object to do the filtering. So in the PowerShell window, we’d type
get-process | where-object {$_.
Now what do we type next? This is a great time to introduce the tab-ahead feature of the PowerShell window, more formally known as tab completion. Tab ahead lets you press the Tab key to cycle through all of the possible valid entries for whatever it is you’re typing. Right now, since we just entered the period, PowerShell is going to let us cycle through all of the available property names that we’d like to use. This is an incredibly valuable feature if you don’t know exactly what you’re looking for. Additionally, if you kind of know what you’re looking for, it’ll help you pinpoint it.
Here, since we know we’re after CPU times, we can type in CP and press Tab and we’ll see the PowerShell window add in the U for us, completing the identification.
So now we have
get-process | where-object {$_.CPU
Time to build the actual comparison logic. We want everything greater than or equal to (so we’d use the -ge operator) 1,000 CPU seconds. That will look like this:
-ge 1000
So now we’ve built
get-process | where-object {$_.CPU -ge 1000
And since we’re done building our where clause, we just add the right curly brace
get-process | where-object {$_.CPU -ge 1000}
and hit enter.
What do we get?
Aha! A list of the two processes that have CPU seconds over 1,000. Guess who the culprit is here? You can read. :-)
What’s pretty cool is that within your comparison logic, you can use the -and operator to add another set of criteria. For instance, in the above example, we could add another criterion that says we want the list to be filtered not only on CPU time as we discussed but also on the number of handles: we want all processes that meet the CPU time criteria and have at least 1,000 handles. That clause would look like this:
get-process | where-object {$_.CPU -ge 1000 -and $_.Handles -ge 1000}
See how that works? You can keep going on and on and on, but obviously at some point your filter becomes really tight. But that’s how you add criteria to where-object and how you construct the comparison clauses.
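Because the processes on your machine will differ from mine, here is a small sketch that runs the same two where clauses against made-up data. The three objects and their names and numbers are hypothetical stand-ins for get-process output, chosen so the result is predictable.

```powershell
# Hypothetical sample data standing in for Get-Process output.
$sample = @(
    [pscustomobject]@{ Name = 'browser'; CPU = 2500; Handles = 3200 }
    [pscustomobject]@{ Name = 'editor';  CPU = 1200; Handles = 400  }
    [pscustomobject]@{ Name = 'idle';    CPU = 3;    Handles = 90   }
)

# One criterion: CPU time of at least 1,000 seconds.
$busy = $sample | Where-Object { $_.CPU -ge 1000 }

# Two criteria joined with -and: both must be true for an object to pass.
$busyAndHeavy = $sample | Where-Object { $_.CPU -ge 1000 -and $_.Handles -ge 1000 }

$busy.Name           # browser and editor
$busyAndHeavy.Name   # browser only
```

The second filter drops "editor" because it passes the CPU test but fails the handles test, which is exactly how each extra -and tightens the result.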
Getting lists of stuff
It’s time to introduce a new cmdlet to you. It’s called get-childitem, and it’s kind of like a Swiss army knife in the PowerShell world. There is a ton you can do with it.
At its core, get-childitem exists to get information on all of the things stored within something. If that something is the parent, then the things stored within it are child items, and thus the name of the cmdlet was born.
For example, a folder contains files, so when we point get-childitem at a folder, what do you think we’d get back?
A list of files!
But get-childitem is also capable of going up through subdirectories/subfolders (or down through them, depending on your point of view) and getting information about the child items within the folders of the original child items. This is called recursion, and you enable it simply by including the -recurse parameter in your command. For example, try this on for size in your own copy of PowerShell:
get-childitem c:\windows -recurse
On my Core i5 machine at 2.9GHz with 32GB of RAM and a solid state hard drive, that command took a few minutes (!) to run. That’s because it’s enumerating every single file within the Windows folder, even if those files are deeper within subfolders under C:\Windows — all with one command. It’d take you a lot more mouse clicks to get through the Windows Explorer GUI to do the same thing. Are you beginning to see why they call it PowerShell?
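If you would rather see what -recurse does without waiting on the whole Windows folder, a tiny throwaway folder makes the difference obvious. This is only a sketch: the folder name pipeline-demo and the file names are invented for the example, and everything is created under your temp directory.

```powershell
# Build a small throwaway folder tree in the temp directory (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'pipeline-demo'
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $root 'top.txt')   -Value 'top level'
Set-Content -Path (Join-Path $sub 'nested.txt') -Value 'one level down'

# Without -Recurse: only the immediate children (top.txt and the sub folder).
$shallow = Get-ChildItem $root

# With -Recurse: those children, plus everything inside the sub folder.
$deep = Get-ChildItem $root -Recurse

$shallow.Name   # 2 items
$deep.Name      # 3 items, including nested.txt
```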
Putting it together
So let’s use this power for good and not for amusement. Let’s say we have a server. Its main hard drive/storage array/default volume/whatever you want to call it is filling up. You suspect that some of your users are storing giant video files on this volume, and you want to find out if this is true. How could we find out in PowerShell?
Let’s assume we’re looking for files with the .mp4 extension, as that’s probably the most commonly used video file format. (In reality, you could search for any extension you wish, but this is a pretty decent example.)
We want to find the video files in c:\users because our users don’t have access to any other part of the disk. That’s a pretty standard setup you get out of the box in most Windows deployments if you are using folder redirection.
So we tell PowerShell we want information about files in c:\users
get-childitem c:\users -recurse
Now how do we search for the extensions? Remember get-childitem is going to return all of the information. What cmdlet do we use to filter out information? Oh yeah, where-object. And how do we get things to where-object? The pipeline!
get-childitem c:\users -recurse | where-object
Ah, we need the comparison logic now. We’ll add the left curly brace to begin the comparison expression, include the “that thing” notation and the period to access the properties of the .NET object that is the result of the get-childitem cmdlet.
get-childitem c:\users -recurse | where-object {$_.
Hmm. We’re looking for extensions, so I’ll type in Ext and see what comes up by tabbing through the options.
get-childitem c:\users -recurse | where-object {$_.Extension
Extension is one of them! There we go. So now I want the extension to equal .mp4, so I’ll use the -eq comparison (remember the list above) and then enclose .mp4 in quotes because, well, take my word for it for now—when you’re searching for text and not numbers, enclose the text in quotes.
get-childitem c:\users -recurse | where-object {$_.Extension -eq ".mp4"
Now just add the right curly brace and we are off to the races.
get-childitem c:\users -recurse | where-object {$_.Extension -eq ".mp4"}
Here’s what I get back:
Voila! I don’t have many video files on this system — just one — but if I ran this on my server, it would come back with quite a few. And other extensions would of course return different results. But now you know how to figure this sort of file management out with PowerShell and not with a billion right-clicks and shift-clicks inside the GUI.
Now, some of you might get an error. This is probably because you are not running your Windows PowerShell instance as an administrator. When you search C:\Users, PowerShell can only access the parts of C:\Users that your current user account has permission to access. By running as an administrator, you can generally search all of C:\Users.
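If running as an administrator isn’t an option, one possible workaround is the -ErrorAction common parameter, which tells a cmdlet what to do when it hits a problem. The sketch below assumes you are happy to skip the folders you can’t read; keep in mind that it hides those errors, so the results may be incomplete.

```powershell
# -ErrorAction SilentlyContinue is a common parameter: folders that cannot be
# read are skipped quietly instead of printing an error for each one.
# C:\Users is the path used in this article; adjust it for your own system.
$videos = Get-ChildItem C:\Users -Recurse -ErrorAction SilentlyContinue |
    Where-Object { $_.Extension -eq ".mp4" }

$videos
```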
Sorting
When you talk about filtering, the discussion naturally leads to sorting. Usually when you are filtering information, you’re trying to organize it, and sorting is another way of doing that.
We should probably start a new saying: “There’s a cmdlet for that!” Because there is a cmdlet for sorting — it is called sort-object. And with just a few exceptions, which generally would apply only in advanced scripting situations, sort-object relies on the pipeline too. After all, sort-object doesn’t — can’t — do much if there isn’t anything to sort.
Let’s look back to the previous file management example. When we searched for movie files contained within C:\Users, we got back a table with all of those files included. Sort-object can sort by any of the fields that are listed in that table. In this case, the size of the file is referred to in PowerShell as the length of the file. Sort-object takes the field names as parameters, exactly like get-process and other cmdlets do. (Remember, get-process chrome got us only processes with Chrome in the name.)
How do we get the results of that movie file search to the sort-object cmdlet? Via the pipeline! We would extend our pipeline into the sort-object cmdlet, like so:
get-childitem c:\users -recurse | where-object {$_.Extension -eq ".mp4"} | sort-object length
You can extend the pipeline out as far as you need to. Each step in the pipeline does something, and each step in the pipeline accepts the results of what happened just before it. The pipeline operates in sequence. Remember you don’t have to just pipe one command into another! You can keep going as long as it makes sense to you.
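Here is a self-contained sketch of that three-step pipeline that you can run anywhere. The folder pipeline-sort-demo and the files in it are invented for the example, and they are tiny text files that merely carry the .mp4 extension. One addition that isn’t in the command above: sort-object sorts from smallest to largest by default, so I’ve used its -descending switch to put the biggest file first, which is usually what you want when hunting for disk hogs.

```powershell
# Create a few dummy files of different sizes in the temp directory.
# The names are hypothetical; these are not real videos.
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'pipeline-sort-demo'
New-Item -ItemType Directory -Path $root -Force | Out-Null
Set-Content -Path (Join-Path $root 'small.mp4')  -Value ('x' * 10)
Set-Content -Path (Join-Path $root 'large.mp4')  -Value ('x' * 1000)
Set-Content -Path (Join-Path $root 'medium.mp4') -Value ('x' * 100)
Set-Content -Path (Join-Path $root 'notes.txt')  -Value ('x' * 5000)

# Step 1: list everything. Step 2: keep only .mp4 files.
# Step 3: sort by size; -Descending puts the largest file first.
$bySize = Get-ChildItem $root -Recurse |
    Where-Object { $_.Extension -eq ".mp4" } |
    Sort-Object Length -Descending

# A fourth step, just for display: show only the name and size columns.
$bySize | Format-Table Name, Length
```

Note that notes.txt is the largest file in the folder but never reaches sort-object, because where-object has already filtered it out one step earlier.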
When you type in sort-object as part of this pipeline, again you can tab through the different options by which you can sort. Remember to tab through! It’s a great way to learn all that is available to you.
The last word
You’ve made it through a comprehensive introduction to the PowerShell pipeline. Today, we learned what the pipeline is and how to activate it using the | character, how to format results as a list and table with format-list and format-table, how to filter and limit results using where-object, how get-childitem gets inside information about stuff that’s ... well ... inside, how to sort results using sort-object, and how pressing Tab to cycle through the available options helps you with your syntax within the PowerShell window.
Now you are equipped to tell all of those runaway processes to “pipe” down. (I’ll be here all week. Be sure to tip your waiter.)