How do you usually collect training data (points/polygons) for large-scale (continental/global) satellite data, so that you can train a model on a machine learning platform like GEE, if you don't have readily available shapefiles? Thank you in advance.
There are basically three options for collecting training data for classifying remotely sensed imagery. Option 1 is to use field-collected data, where people go out with GPS units and record locations and labels on the ground. Option 2 is to photo-interpret training data from imagery with a higher resolution than the imagery you are classifying. If you are using MODIS, Landsat, or Sentinel, for example, you might use WorldView-2 or Planet Labs data to identify features visually. SERVIR has an interesting app called Collect Earth that they have used to automate the collection of training/validation data (https://collect.earth/about). Option 3 is to use other datasets, such as what Gomal Amin has suggested. Using others' models to train your own is a convenient shortcut, but it brings some disadvantages: you basically inherit all of the error of someone else's model, plus the error introduced by extrapolating it to your own study area.
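To make Option 3 a bit more concrete, here is a minimal sketch of drawing a stratified random sample of labelled points from an existing land-cover raster. It uses a small NumPy array as a stand-in for a real product; the function name `stratified_sample`, the toy class codes, and the nodata value of 0 are all assumptions made for illustration, not part of any particular dataset or platform.

```python
import numpy as np


def stratified_sample(landcover, n_per_class, nodata=0, seed=42):
    """Pick up to n_per_class random pixels from each class of a label raster.

    Returns a list of (row, col, class_label) tuples. Hypothetical helper,
    written for illustration only.
    """
    rng = np.random.default_rng(seed)
    samples = []
    for cls in np.unique(landcover):
        if cls == nodata:
            continue  # skip pixels with no reference label
        rows, cols = np.nonzero(landcover == cls)
        n = min(n_per_class, rows.size)  # rare classes may have fewer pixels
        idx = rng.choice(rows.size, size=n, replace=False)
        samples.extend(
            (int(r), int(c), int(cls)) for r, c in zip(rows[idx], cols[idx])
        )
    return samples


# Toy reference map: 0 = nodata, 1-3 = made-up land-cover classes.
landcover = np.array([
    [1, 1, 2, 2, 0],
    [1, 1, 2, 2, 0],
    [3, 3, 2, 2, 0],
    [3, 1, 1, 1, 0],
])

points = stratified_sample(landcover, n_per_class=3)
for row, col, label in points:
    print(row, col, label)
```

Sampling the same number of points per class keeps rare classes from being swamped by common ones, but every label is still only as good as the reference map it came from. In GEE itself, `ee.Image.stratifiedSample` should do a similar job directly on a land-cover image.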