It is a secondary source. The data are derived by a permutation process applied to different primary sources, but the length of each permuted segment cannot be ascertained unless the output is compared with those primary sources. It is not proper procedure to treat the output as primary until it can be established that it is neither plagiarized nor hallucinated.
I think ChatGPT is a secondary source, and even a tertiary one, because it collects information from other sources (primary and secondary) to give an answer. Researchers should be cautious about using it in their research, because the information ChatGPT gives them carries no citations, so they still cannot distinguish valid information from invalid information!
ChatGPT, including the advanced GPT-4 version, is an AI model that synthesizes information rather than producing it through original research or direct observation. Thus, it functions more like a secondary source. It's important to understand that this AI doesn't have beliefs, opinions, or firsthand experiences. Instead, it generates outputs based on the patterns it learned during its training on a diverse range of internet text.
However, the distinction between primary and secondary sources in research can depend on the context:
As a Secondary Source: In most cases, if you're using information directly from GPT-4, it's a secondary source. This is because GPT-4 compiles, interprets, analyzes, and synthesizes information it has learned from its training data, rather than providing original, firsthand data or evidence.
As a Primary Source: In some specific contexts, GPT-4 could potentially be considered a primary source. For instance, if you're conducting research on AI-generated text, the outputs of GPT-4 would be primary data because they are original products of the AI system you're studying. If you use GPT-4 to generate responses to research questionnaires, those responses could be considered primary data for your study on AI responses to these questionnaires.
Remember, though, that even when GPT-4 is treated as a primary source in these specific contexts, it is still generating responses based on patterns it learned from secondary data, not from its own experiences or original research.
As with all sources, it's crucial to use critical thinking when evaluating the information you receive from GPT-4, and to cross-reference it with other sources for accuracy.
Generally speaking, for all the reasons mentioned above, ChatGPT is a secondary source. Yet it may be considered a primary source if the researcher is interested in exploring the additional information that ChatGPT offers: information regarding different versions, or metadata fields such as recurrence in questions, time and location, commonly associated subjects, etc.
My data collectors conducted interviews using qualitative questions. They entered the collected data into Perplexity AI, and the tool produced a long note. The collectors submitted this long note to me; it is secondary data from a secondary source. When I paste it into Perplexity, its retrieval result is the same. So it is a secondary source.
ChatGPT, including ChatGPT-4, is an AI model that extracts and synthesizes information based on patterns it has learnt through deep learning algorithms. Basically, it cannot create new ideas and concepts from what it knows, nor can it update itself by conducting research or observation.
However, the information it provides can be considered a primary or a secondary source depending on the focus and context of the study. If you use ChatGPT-generated information to study the quality and performance of ChatGPT itself, it can be considered a primary source; otherwise it is a secondary source, since the information it supplies is not original or firsthand but is generated from patterns it learnt from a very large dataset.