An Extreme Learning Machine can be understood as a two-stage classifier: a projection stage and a classification (or regression) stage.
At the projection stage, random weights project the data into a new (random) feature space. This stage, as you mention, is not trainable. The classification stage, by contrast, focuses on minimizing a loss function (e.g., the MSE). Its weights are trainable and can be fitted with, for instance, SGD or least squares.
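To make the two stages concrete, here is a minimal NumPy sketch of an ELM fit. The hidden-layer size, activation, and function names are my own choices for illustration, not a specific library's API:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_elm(X, Y, n_hidden=100):
    # Projection stage: fixed random weights/biases, never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                      # random nonlinear feature map
    # Classification stage: trainable output weights, fitted here by
    # least squares (SGD on the MSE would work as well).
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = np.tanh(X @ W + b)
    return H @ beta                              # argmax over columns -> class
```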
The ELM classifier is very similar to the Radial Basis Function (RBF) classifier; the main difference is the projection stage. While ELM uses random projections, RBF uses radial basis projections, and in the classification stage they perform the same process (as mentioned above). Therefore, you can get a good understanding of ELM by reading RBF papers/tutorials.
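For example, if you swap the random projection in the sketch above for a radial basis projection, the classification stage stays exactly the same. In this sketch the centers would typically be chosen by clustering the training data, and `gamma` is an assumed width parameter:

```python
import numpy as np

def rbf_projection(X, centers, gamma=1.0):
    # One feature per center: exp(-gamma * squared distance to that center).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)
```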
Below is an interesting link about RBF that can help you understand ELM: