CPU scheduling is the problem of deciding which computer process in the ready queue (in other words, which particular programs need some processing and are ready and waiting for it) is to be allocated to the CPU for processing. It is a fundamental problem in operating systems in terms of minimizing the wait for the user when he or she simply wants to execute a particular set of tasks.
There are several approaches for solving this problem. The simplest of all the solutions (and the most naive) is the first-come first-served algorithm. With this scheme, whichever process requests CPU time first is allocated the CPU first. This is easily implemented with a FIFO queue for managing the tasks; as they come in, they're put on the end of the queue. As the CPU finishes each task, it pops it off the start of the queue and heads on to the next one.
The average waiting time for this technique, though, is often quite long, and that is the drawback of this approach. Let's take three processes that arrive at the same time in this order:
Process CPU Time Needed (ms)
------- -------------------
P-1 100
P-2 2
P-3 2
If we serve these guys up in order, we get this order of execution:
P-1 starts at time 0 and ends at time 100.
P-2 starts at time 100 and ends at time 102.
P-3 starts at time 102 and ends at time 104.
Thus, P-2 has to wait 100 milliseconds to start and P-3 has to wait 102 milliseconds. The average waiting time here is (0 + 100 + 102) / 3 = 67.3 milliseconds. But if we just switch the order a bit to P-2, then P-3, then P-1, then the average waiting time becomes (0 + 2 + 4) / 3 = 2 milliseconds. Clearly, the average waiting time under a purely first-in first-out system is going to often be poor if one task is significantly longer than the others.
So, the first idea that comes about after seeing this is the idea of having the shortest task go first, or shortest-job-first scheduling. This, obviously, would be similar to the idea above, except that as each job comes in, it is sorted into the queue based on size.
In concept, this idea makes a lot of sense, and one can easily prove that it is optimal in terms of giving the lowest average waiting time for a set of processes. The difficulty with this algorithm, though, is knowing which incoming process is indeed shorter than another. There is no way to know the length of the next CPU burst (the amount of time a given process will need), so this type of scheduling is largely impossible. It does have some use for scheduling long-term jobs, when you know exactly how long they will run and you are attempting to plan them out over a long period of time, but in terms of doing this on the fly with incoming processes, it is largely impossible.
Another solution to this problem, and one that somewhat solves the problem above, is the idea of a priority-scheduling method. This means that each process has a priority associated with it, and as each process hits the queue, it is sorted in based on its priority so that processes with higher priority are dealt with first. This is essentially the shortest-job-first idea, except in that case the priority is the inverse of the time the process needs. A variation on this theme is multilevel queue scheduling, where each task is designated one of several different levels of importance, and the most important ones always run first.
This is quite effective at scheduling jobs for a CPU, assuming that the operating system has the sense to properly assign priorities to the tasks. There is only one real problem with this method, and that itself has a clever solution. The problem occurs when the operating system gives a particular task a very low priority, so it sits in the queue for an exorbitant amount of time, not being dealt with by the CPU. If this process is something the user needs, there could be a very long wait. As a result, many operating systems use a technique called aging, in which a low priority process slowly gains priority over time as it sits in the queue. That way, no matter how low the priority of the process is, it will eventually be dealt with.
If your primary concern is a small standard deviation between the times that it takes for your processes to finish, meaning that you don't mind waiting a long time for each process provided they all finish in roughly the same time, another approach to solving the CPU scheduling problem, round-robin scheduling, may be the ticket. In this approach, a time slice is defined, which is a particular small unit of time. In each time slice, the process being actively dealt with by the CPU runs until the end of the time slice; if that process is done, it is discarded and the next one in the queue is dealt with. However, if the process isn't done, it is halted and put at the back of the queue, then the next process in line is addressed during the next time slice.
This method vastly slows down short processes, because they have to share the CPU time with other processes instead of just finishing up quickly. Thus, a good rule of thumb is to make sure that the length of the time slice is such that 80% of your processes can run in one time slice. This way, only the longer processes wrangle over CPU time and the short jobs aren't stuck at the back of the queue.
There are many approaches to solving the problem of CPU scheduling; likely, your operating system uses one of or a combination of the above methods for scheduling the CPU time of the programs you are running.