I have ERP data from two languages and am looking for the earliest ERP differences between the word conditions (probably also pseudowords, but that is less important). I would like to describe the differences in relation to components that are already known, rather than simply giving times without reference to previous work on word recognition. If there are any review papers describing responses in relation to word processing, or even simply an established nomenclature for describing auditory components, I would be happy to know about them. I'm really not familiar with auditory ERPs. This would be of real help.
You should see the N1-P2 component at central-to-anterior electrodes (around Fz), and with an average reference specifically, you should see inversion (i.e. opposite voltage values) at the temporal sites on each side of the head.
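To make the average-reference point concrete, here is a minimal sketch in Python with NumPy. The amplitudes are made-up toy values at a single N1 peak sample, not real data, and `average_reference` is a hypothetical helper written for this illustration, not a function from any EEG toolbox. It only shows the arithmetic: subtracting the mean across channels pushes sites with a small response to the opposite sign of the sites with a large response.

```python
import numpy as np

def average_reference(data):
    """Re-reference to the channel average.

    data: array of shape (n_channels, n_times).
    Returns the data with the across-channel mean subtracted at each time point.
    """
    data = np.asarray(data, dtype=float)
    return data - data.mean(axis=0, keepdims=True)

# Hypothetical channel set and toy N1 peak amplitudes in microvolts
# (assumed values for illustration only).
channels = ["Fz", "Cz", "T7", "T8"]
raw = np.array([[-4.0], [-3.0], [-0.5], [-0.5]])

reref = average_reference(raw)

for name, value in zip(channels, reref[:, 0]):
    print(f"{name}: {value:+.2f} uV")
```

In this toy case the frontocentral sites stay negative after re-referencing while T7 and T8 become positive. With a real montage the pattern depends on how many electrodes enter the average and where they sit, so treat this as an illustration of the principle only.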
There are reviews of the early, obligatory responses to tones that Arild describes, e.g., Ruhnau et al. (2011). Maturation of obligatory auditory responses and their neural sources: evidence from EEG and MEG. NeuroImage, 58(2), 630–9. doi:10.1016/j.neuroimage.2011.06.050.
When you present words, there are additional later components. The Phonological Mapping Negativity (PMN) is specific to auditory presentation, e.g., Connolly, J. F., & Phillips, N. A. (1994). Event-Related Potential Components Reflect Phonological and Semantic Processing of the Terminal Word of Spoken Sentences. Journal of Cognitive Neuroscience, 6, 256–266.
Then there are the N400s, which have been interpreted as markers of semantic integration but may reflect semantic processing more broadly. For a good overview, try Duncan et al. (2009). Event-related potentials in clinical research: Guidelines for eliciting, recording, and quantifying mismatch negativity, P300, and N400. Clinical Neurophysiology, 120, 1883–1908. doi:10.1016/j.clinph.2009.07.045.
Note that the inversion Arild mentioned only appears at temporal sites, e.g., T7 / T8.