Tong Guo, what kind of emergent properties would you expect in the image domain? You're asking in comparison to language models, but I don't quite see the connection.
For large language models, several improvements and developments are still underway given current trends and technologies. There are also limitations to address and benchmarks yet to be met regarding truthfulness, bias, etc.
Check this article for reference:
[2206.07682] Emergent Abilities of Large Language Models (arxiv.org)
The same applies to computer vision and image-processing-based learning. Deep learning models are adaptive but still need improvement. One example is Image GPT (openai.com):
https://openai.com/research/image-gpt
There have been several experiments and projects on vision processing and detection using LLMs. For example: Preprint Towards Language Models That Can See: Computer Vision Throug...
So yes, if computing power increases and sufficient predictive accuracy is reached, computer vision models might show emergent abilities as well.
That is an interesting question. Emergent capability is the term used to describe the phenomenon of large language models (LLMs) like ChatGPT exhibiting unexpected and novel abilities that they were not explicitly trained for, such as solving math problems, generating code, or interpreting emojis. These abilities are thought to arise from the massive amounts of data and parameters that LLMs use, as well as the feedback mechanisms used to fine-tune them.
Computer vision is the field of AI that deals with processing and understanding visual information, such as images and videos. Computer vision models can perform tasks such as object detection, face recognition, scene segmentation, and image generation. These models also use large amounts of data and parameters, as well as feedback mechanisms, to learn from visual inputs.