There are two interrelated dimensions in operant learning. One concerns whether responding is strengthened or weakened over time (reinforcement and punishment, respectively), and the other concerns whether this effect is accomplished by adding or removing a stimulus, event or condition contingent upon the response (and relative to the antecedent conditions).
Thus, positive reinforcement occurs when a stimulus is added contingent upon a response, and the resulting effect over time is to increase the frequency (or other dimension) of the response (e.g., pressing a lever produces a bit of food for an animal on a restricted feeding schedule; adding [+] the food increases response frequency).
Negative reinforcement is said to occur when the response is strengthened over time as a result of the removal of a stimulus contingent upon that response (e.g., pressing a lever terminates an electric shock; subtracting [-] the shock increases response frequency).
Punishment is the process by which a response is diminished over time, and it likewise has positive and negative varieties. Positive punishment reduces the frequency (or other dimension) of responding by adding a stimulus contingent upon the response (e.g., adding an electric shock when a lever is pressed under a food reinforcement schedule; adding [+] the shock decreases the rate of responding).
Negative punishment occurs when a stimulus is removed following the response and the response diminishes in frequency (e.g., in a differential reinforcement of low rates paradigm, any response occurring prior to a minimal interval of time resets the clock and delays reinforcement, thereby reducing higher rates of responding; subtracting [-] the food decreases responses with low inter-response times).
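The four cases above form a two-by-two table: one axis records whether a stimulus is added [+] or removed [-] contingent upon the response, and the other records whether responding increases or decreases over time. A minimal sketch of that table in Python follows. The names `StimulusChange`, `ResponseTrend` and `classify_contingency` are hypothetical and chosen only for illustration; they do not come from any established library.

```python
from enum import Enum


class StimulusChange(Enum):
    """Whether the stimulus is added or removed contingent upon the response."""
    ADDED = "+"
    REMOVED = "-"


class ResponseTrend(Enum):
    """Whether responding is strengthened or weakened over time."""
    INCREASES = "reinforcement"
    DECREASES = "punishment"


def classify_contingency(stimulus: StimulusChange, trend: ResponseTrend) -> str:
    """Name the operant procedure for one cell of the two-by-two table."""
    sign = "positive" if stimulus is StimulusChange.ADDED else "negative"
    return f"{sign} {trend.value}"


# The four examples given in the text, keyed by a short description.
examples = {
    "lever press produces food": (StimulusChange.ADDED, ResponseTrend.INCREASES),
    "lever press terminates shock": (StimulusChange.REMOVED, ResponseTrend.INCREASES),
    "lever press produces shock": (StimulusChange.ADDED, ResponseTrend.DECREASES),
    "early response delays food": (StimulusChange.REMOVED, ResponseTrend.DECREASES),
}

for description, (stimulus, trend) in examples.items():
    print(f"{description}: {classify_contingency(stimulus, trend)}")
```

Note that the label depends only on the operation (add or remove) and the observed effect on responding, not on whether the stimulus is pleasant or unpleasant.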
"Reward" is synonymous with "reinforcement" but carries additional connotations since, in English at least, it also implies subjective feelings of pleasure, on the one hand, and social arrangements wherein material gifts or bounties are offered on the other. These latter conditions are irrelevant to an empirical and materialist analysis of behavior.
No, that’s wrong. There’s no notion of "positive" or "negative" data. There are review articles on this subject; it would be a good idea to study them.
In many settings, positive values are closely associated with rewards. In reinforcement learning, for instance, actions or behaviours that bring an agent closer to its goal or enhance its performance typically receive positive rewards. Conversely, negative values are commonly linked to penalties or costs: actions or behaviours that deviate from the desired outcome or hinder the agent's performance typically receive negative rewards.
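As a rough illustration of how the sign of a reward can shape an agent's preferences, the sketch below applies a simple incremental value update to two actions. Everything here is an assumption made for the example: the action names, the reward values of +1 and -1, the step size, and the helper `update_value` are all hypothetical, and the update rule is one common textbook form rather than the only way reinforcement learning is implemented.

```python
def update_value(value: float, reward: float, step_size: float = 0.5) -> float:
    """Move the current estimate a fraction of the way toward the observed reward."""
    return value + step_size * (reward - value)


# Hypothetical actions: one is rewarded (+1), the other is penalized (-1).
rewards = {"toward_goal": 1.0, "away_from_goal": -1.0}
values = {action: 0.0 for action in rewards}

# Repeatedly experience each action's reward and update its estimated value.
for _ in range(10):
    for action, reward in rewards.items():
        values[action] = update_value(values[action], reward)

# A greedy agent would choose the action with the highest estimated value.
preferred = max(values, key=values.get)
print(values, preferred)
```

Under these assumptions, the estimate for the positively rewarded action rises above zero and the estimate for the penalized action falls below it, so a greedy choice favours the former.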