A vowel is any vocal sound that can be made continuously with no blockage of the oral cavity. So the lips are open, and the tongue is not touching the interior of the mouth.
The "oral cavity" excludes the larynx. The different ways the vibrations and tensions in the larynx affect the quality of a vowel are called the phonation. This can be more complicated than I understand (see note 1), but the simplest phonation is called voicing. A sound is voiced when the vocal cords (also spelled vocal chords), which are folds of tissue inside the larynx, vibrate. In all languages, without exception, most vowels are voiced. Most languages have only voiced vowels.
One essential quality of a vowel is tongue position. The tongue must be held somewhere in the mouth: if it is high up and towards the front, it makes the "ee" sound of machine; if it is high up and towards the back it makes the "oo" sound of rule; and if it is low in the mouth it makes the "ah" sound of father. The phonetic symbols for these sounds are [i], [u], and [a] (see note 2). All languages, without exception, use a variety of vowel sounds made by varying the tongue position (see note 3).
The simplest vowel system encountered in languages is a three-vowel one, with just these three [i u a]. Examples are Classical Arabic, most Australian Aboriginal languages, and Inuktitut. Actually the most extreme positions the centre of the tongue is capable of form not a triangle but an irregular quadrilateral, called the vowel quadrilateral. [i] is higher than [u] and much further forward than [a]. At the lowest possible extent, you can have a further-forward A-sound and a further-back one, forming a narrow base to the quadrilateral.
Midway in height between [i] and [a] is the [e] sound in bet, or Spanish E. Midway between [a] and [u] is [o]; Spanish O is closer to midway than any English O sound is. This five-vowel system [i e a o u] is also fairly common in the world's languages. For example, ancient Etruscan and Latin had it (as does Spanish), which is why our alphabet has only those five letters with which to represent the twenty or so vowels of English, and comparably large numbers in French and German. But many languages have quite a lot more gradations between these positions. A vowel made in the middle of the quadrilateral is called a neutral vowel (or schwa).
Normally the tongue position is correlated with lip-spreading or lip-rounding. The lips are spread for [i], go down to neutral for [a], and become rounded going up the back to [u]. All languages have this correlation; but quite a few languages have additional sounds, front vowels like [i] and [e] but with rounded lips: the ü and ö sounds of German, written u and eu in French. A smaller number have back vowels with spread lips: there are no familiar European examples, but they occur in some Asian languages (Japanese U).
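The description so far can be put into a small table, as a rough sketch rather than a proper phonological analysis. The names Vowel, VOWELS and follows_default_rounding are made up for this illustration, the symbols [y], [ø] and [ɯ] are the usual phonetic symbols for the front rounded and back unrounded vowels mentioned above (the text itself does not name them), and calling [a] "central" is a simplification.

```python
from typing import NamedTuple


class Vowel(NamedTuple):
    height: str    # "high", "mid", or "low"
    backness: str  # "front", "central", or "back"
    rounded: bool  # True when the lips are rounded


# Hypothetical feature table; the labels are simplified for illustration.
VOWELS = {
    "i": Vowel("high", "front", False),
    "e": Vowel("mid", "front", False),
    "a": Vowel("low", "central", False),
    "o": Vowel("mid", "back", True),
    "u": Vowel("high", "back", True),
    "y": Vowel("high", "front", True),   # German ü, French u
    "ø": Vowel("mid", "front", True),    # German ö, French eu
    "ɯ": Vowel("high", "back", False),   # roughly Japanese U
}

THREE_VOWEL_SYSTEM = ["i", "u", "a"]
FIVE_VOWEL_SYSTEM = ["i", "e", "a", "o", "u"]


def follows_default_rounding(v: Vowel) -> bool:
    """True when rounding matches the usual correlation:
    back non-low vowels are rounded, everything else is not."""
    expected = v.backness == "back" and v.height != "low"
    return v.rounded == expected


# Vowels that break the usual correlation between position and rounding.
unusual = [s for s, v in VOWELS.items() if not follows_default_rounding(v)]
print(unusual)
```

Every vowel of the three-vowel and five-vowel systems follows the default correlation in this table; only the additional front rounded and back unrounded vowels are flagged.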
The mouth is more open for the low vowel [a] and more closed for the high vowels [i u]. In theory I suppose you could reverse this correlation but I don't know of any language that does so. High and low vowels are therefore often called close and open vowels. This is convenient, because of course "high" and "low" can also refer to the unrelated aspect of pitch. (Pitch is a form of phonation.)
Normally the velum or soft palate, which separates the oral cavity from the nasal cavity, is shut, blocking off the nose from the air stream. These are called oral vowels. All languages, without exception, have oral vowels. Most languages have nothing but oral vowels. But quite a few also have nasal vowels, where the air stream additionally vibrates through the nose: familiar examples are French an, on, in, un. See my node nasalization for full details.
Other mechanisms that change vowel sounds are rare. American English curls the tip of the tongue back towards the palate in vowels such as ar, er, or.
In certain East African languages such as Maasai, whether the tongue root is extended or retracted makes a difference. In the West African language Twi this is correlated with a difference in quality (similar to bit vs beet). In English this bit ~ beet distinction is traditionally described as short vs long, but they may be of equal duration physically, and other features then mark the difference, including this advanced tongue root (ATR) feature. "Long" vowels are +ATR. This distinction was formerly analysed as "tense" vs "lax", but these terms don't describe it very well.
The pharynx, the throat space above the larynx, can be tightened. German is often pharyngalized. This is often accompanied by tongue root retraction but in theory the two are separate.
Acoustically, vowels are transmitted through the air in two bands of energy called formants. The lower band, called F1, increases in frequency as you go from close to open; the upper band, called F2, increases in frequency as you go from back to front. A vowel is called compact when F1 and F2 are close together (such as [a]) and diffuse when they are far apart.
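To make the compact/diffuse idea concrete, here is a minimal sketch. The formant figures are rough illustrative values in hertz, of the order commonly quoted for adult male speakers; they are my assumption and do not come from this write-up, and real values vary a good deal between speakers. The names FORMANTS and formant_gap are likewise invented for the example.

```python
# Rough illustrative (F1, F2) values in Hz; assumed, not measured here.
FORMANTS = {
    "i": (270, 2290),  # close front
    "a": (730, 1090),  # open
    "u": (300, 870),   # close back
}


def formant_gap(symbol: str) -> int:
    """Distance between F2 and F1: small means compact, large means diffuse."""
    f1, f2 = FORMANTS[symbol]
    return f2 - f1


# F1 rises from close to open, so the open vowel has the highest F1.
highest_f1 = max(FORMANTS, key=lambda s: FORMANTS[s][0])

# F2 rises from back to front, so the front vowel has the highest F2.
highest_f2 = max(FORMANTS, key=lambda s: FORMANTS[s][1])

most_compact = min(FORMANTS, key=formant_gap)
most_diffuse = max(FORMANTS, key=formant_gap)

print(highest_f1, highest_f2, most_compact, most_diffuse)
```

With these assumed figures, [a] comes out as the most compact vowel and [i] as the most diffuse, which agrees with the description above.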
1. In fact it is beyond the scope of this write-up to discuss any phonation but the default voicing.
2. Phonetic symbols for shades of sound are enclosed in square brackets. These symbols are why I said "ee" as in machine, not "ee" as in feet.
3. The Caucasian language Kabardian is sometimes described as having "only one vowel". The description of vowels in Kabardian is a problem, but in terms of my explanation above, Kabardian has multiple vowels like any other language: see North-West Caucasian for more detail.
Afterthought. I don't pay much attention to this "is Y a vowel?" business. Vowels are sounds, not letters. Some letters are used to represent vowels, but these include H, Y, G (paradigm), GH, W, R, L (talk), and probably a few others. It's not a question that has a definite answer, nor is it important.