ChatGPT, Translation, And Confidentiality — ‘We May Use The Data’

Slator

Uploaded on May 12, 2023

Category News & Politics

OpenAI updates data usage policies after data breach, states content sent via API not used to train LLMs; ChatGPT not included in the update.

There are no specific terms of use for OpenAI’s consumer services, which is 
how the company classifies ChatGPT. The answer to what happens to content 
submitted to ChatGPT for translation is not found in the user terms of service, 
as most people would expect. Instead, it is found in OpenAI’s Data Controls 
Frequently Asked Questions (FAQs) and various linked documents.
The terms of use, or what the company calls data usage policies, govern its API 
services. These policies have changed since March 2023, which is when OpenAI 
confirmed a data breach caused by a bug in ChatGPT’s source code. Under 
public criticism over the breach, the company updated the policies to address 
data confidentiality and security concerns.
OpenAI’s policy on data submitted by customers via its API is that it will not 
be used to train or improve the models.
OpenAI stated that a vulnerability in the Redis open-source library used by 
ChatGPT caused some active users’ chat history to become visible to other users
active at the same time. It also acknowledged that some payment information 
from premium users was leaked in March as well, but played down the potential 
consequences of this breach.
Enter at Your Own MT Risk
One of the documents linked in the data usage policies is a general statement of 
how data is used when transmitted across its consumer services: “When you use
our non-API consumer services ChatGPT or DALL-E, we may use the data you 
provide us to improve our models. You can switch off training in ChatGPT 
settings (under Data Controls) to turn off training for any conversations created 
while training is disabled …”
These policies make no distinction between the free service and the paid 
subscription service, called ChatGPT Plus. The paid version simply keeps the 
service available during periods of high demand, and OpenAI claims it is faster 
and offers priority access to new features.
For API usage, OpenAI states in its Data Usage Policies, “OpenAI will not use data 
submitted by customers via our API to train or improve our models, unless you 
explicitly decide to share your data with us for this purpose. You can opt-in to 
share data. Any data sent through the API will be retained for abuse and misuse 
monitoring purposes for a maximum of 30 days, after which it will be deleted 
(unless otherwise required by law).”
Language translation is just one of the many tasks ChatGPT is capable of 
performing, and there is no specific mention of content submitted for that 
purpose in the updated data usage policies. 
Users who search the Support area of OpenAI’s site (called “Advice and answers
from the OpenAI Team”) for any specific mentions are redirected to the Data 
Controls FAQs.
Lock That Door
It is still too early in the evolution of LLMs to see large-scale use of their 
translation capabilities, but there have been some early integrations with 
translation management systems, which will depend on robust encryption and 
safety features like two-factor authentication to secure this data.
As an example of easy yet risky access, ChatGPT was also being used by 
Samsung employees for translation and other tasks until the company’s 
leadership prohibited use of the AI tool altogether in April 2023, citing security 
concerns.
Unfortunately for the general [non-paying] public, sensitive information cannot 
be considered secure when submitted to free translation services. In an example
that predates LLM-served translation by a few years, confidential texts 
submitted to Translate.com’s free service turned up in search results from 
Google and Microsoft in 2017, after which the company admitted that 
translations were “sent to our community to improve accuracy.”
Other providers, like DeepL Translate, have also made the news regarding 
questions around data management. DeepL has separate terms and conditions 
for the free MT product and the Pro version. Under a section titled “Processing 
of the submitted Texts” (a header that looks like unedited German into English 
MT), the terms for the free version state that content uploaded for translation, 
as well as the translations generated and post-edited, are processed for an 
unspecified amount of time to train neural networks and translation algorithms.
What these MT providers have in common with ChatGPT, as far as translation is 
concerned, is that users are made responsible for their use of data, confidential 
or otherwise. In all cases, users are also allowing companies to use data for 
various purposes unless they opt out.
