AI and Open Government Data Assets Request for Information, 27411-27413 [2024-08168]
Federal Register / Vol. 89, No. 75 / Wednesday, April 17, 2024 / Notices
II. Approval of Minutes
III. Committee Discussion
IV. Public Comment
V. Adjournment
Dated: April 12, 2024.
David Mussatt,
Supervisory Chief, Regional Programs Unit.
[FR Doc. 2024–08181 Filed 4–16–24; 8:45 am]
BILLING CODE P
COMMISSION ON CIVIL RIGHTS
Notice of Public Meeting of the Nevada
Advisory Committee to the U.S.
Commission on Civil Rights
AGENCY: U.S. Commission on Civil Rights.
ACTION: Announcement of virtual business meeting.
SUMMARY: Notice is hereby given,
pursuant to the provisions of the rules
and regulations of the U.S. Commission
on Civil Rights (Commission) and the
Federal Advisory Committee Act, that
the Nevada Advisory Committee
(Committee) to the U.S. Commission on
Civil Rights will hold a virtual business
meeting via ZoomGov at 1:30 p.m.
Pacific on Friday, April 19, 2024. The
purpose of the meeting is to finalize a
post-report activity that involves
reviewing the Committee’s Op-Ed.
DATES: Friday, April 19, 2024, from 1:30
p.m.–2:00 p.m. PT.
ADDRESSES:
Zoom Webinar Link to Join (Audio/Visual):
https://www.zoomgov.com/s/1619807338?pwd=L01ycEp1cG0vNS9jeWxyOFpSc09RQT09.
Telephone (Audio Only): Dial (833)
435–1820 USA Toll Free; Meeting ID:
161 980 7338.
FOR FURTHER INFORMATION CONTACT: Ana
Fortes, Designated Federal Officer, at
afortes@usccr.gov or (202) 519–2938.
SUPPLEMENTARY INFORMATION:
Committee meetings are available to the
public through the registration link
above. Any interested member of the
public may listen to the meeting. An
open comment period will be provided
to allow members of the public to make
a statement as time allows. Per the
Federal Advisory Committee Act, public
minutes of the meeting will include a
list of persons who are present at the
meeting. If joining via phone, callers can
expect to incur regular charges for calls
they initiate over wireless lines,
according to their wireless plan. The
Commission will not refund any
incurred charges. Callers will incur no
charge for calls they initiate over landline connections to the toll-free
telephone number. Closed captioning
will be available for individuals who are
deaf, hard of hearing, or who have
certain cognitive or learning
impairments. To request additional
accommodations, please email Angelica
Trevino, Support Specialist, at
atrevino@usccr.gov at least ten (10) days
prior to the meeting.
Members of the public are entitled to
make comments during the open period
at the end of the meeting. Members of
the public may also submit written
comments; the comments must be
received in the Regional Programs Unit
within 30 days following the meeting.
Written comments may be emailed to
Ana Fortes (DFO) at afortes@usccr.gov.
Records generated from this meeting
may be inspected and reproduced at the
Regional Programs Coordination Unit
Office, as they become available, both
before and after the meeting. Records of
the meetings will be available via
www.facadatabase.gov under the
Commission on Civil Rights, Nevada
Advisory Committee link. Persons
interested in the work of this Committee
are directed to the Commission’s
website, https://www.usccr.gov, or may
contact the Regional Programs
Coordination Unit at
atrevino@usccr.gov.
Agenda
I. Welcome, Roll Call, and
Announcements
II. Review Op-Ed
III. Vote on Op-Ed
IV. Adjournment
Exceptional Circumstance: Pursuant
to 41 CFR 102–3.150, the notice for this
meeting is given less than 15 calendar
days prior to the meeting due to the
availability of staff and the Committee.
Dated: April 12, 2024.
David Mussatt,
Supervisory Chief, Regional Programs Unit.
[FR Doc. 2024–08182 Filed 4–16–24; 8:45 am]
BILLING CODE P
DEPARTMENT OF COMMERCE
Foreign-Trade Zones Board
[Order No. 2160]
Production Authority Not Approved;
Foreign-Trade Zone 38; Teijin Carbon
Fibers, Inc.; (Carbon Fiber);
Greenwood, South Carolina
Pursuant to its authority under the Foreign-Trade Zones Act of June 18, 1934, as
amended (19 U.S.C. 81a–81u), the Foreign-Trade Zones Board (the Board) adopts the
following Order:
Whereas, the Foreign-Trade Zones
(FTZ) Act provides for ‘‘. . . the
establishment . . . of foreign-trade
zones in ports of entry of the United
States, to expedite and encourage
foreign commerce, and for other
purposes,’’ and authorizes the FTZ
Board to grant to qualified corporations
the privilege of establishing foreign-trade zones in or adjacent to U.S.
Customs and Border Protection ports of
entry;
Whereas, the South Carolina State
Ports Authority, grantee of FTZ 38, has
requested production authority on
behalf of Teijin Carbon Fibers, Inc.,
within FTZ 38 in Greenwood, South
Carolina (B–52–2020, docketed August
6, 2020);
Whereas, notice inviting public
comment has been given in the Federal
Register (85 FR 49359, August 13, 2020;
85 FR 68557, October 29, 2020; 85 FR
81875, December 17, 2020; 86 FR 7695,
February 1, 2021; 86 FR 10040, February
18, 2021; 86 FR 23672, May 4, 2021; 86
FR 33218, June 24, 2021; 86 FR 38010,
July 19, 2021; 86 FR 48982, September
1, 2021; 88 FR 5853, January 30, 2023;
88 FR 12912, March 1, 2023) and the
application, as amended, has been
processed pursuant to the FTZ Act and
the Board’s regulations; and,
Whereas, the Board adopts the
findings and recommendations of the
examiner’s report, and finds that the
requirements of the FTZ Act and the
Board’s regulations have not been
satisfied;
Now, therefore, the Board hereby does
not approve the application, as
amended, requesting to remove the
restriction requiring that all foreign
status 24,000 tow PAN fiber admitted
for production activity be re-exported
(entry for U.S. consumption was not
authorized) within FTZ 38 at the facility
of Teijin Carbon Fibers, Inc., located in
Greenwood, South Carolina, as
described in the application and
Federal Register notice.
Dated: April 12, 2024.
Dawn Shackleford,
Executive Director of Trade Agreements
Policy & Negotiations, Alternate Chairman,
Foreign-Trade Zones Board.
[FR Doc. 2024–08189 Filed 4–16–24; 8:45 am]
BILLING CODE 3510–DS–P
DEPARTMENT OF COMMERCE
Foreign-Trade Zones Board
[S–68–2024]
Foreign-Trade Zone 80; Application for
Subzone; Vitesco Technologies USA,
LLC; Seguin, Texas
An application has been submitted to
the Foreign-Trade Zones (FTZ) Board by
the City of San Antonio, grantee of FTZ
80, requesting subzone status for the
facility of Vitesco Technologies USA,
LLC, located in Seguin, Texas. The
application was submitted pursuant to
the provisions of the Foreign-Trade
Zones Act, as amended (19 U.S.C. 81a–
81u), and the regulations of the FTZ
Board (15 CFR part 400). It was formally
docketed on April 11, 2024.
The proposed subzone (50 acres) is
located at 3740 North Austin Street,
Seguin, Texas. No authorization for
production activity has been requested
at this time. The proposed subzone
would be subject to the existing
activation limit of FTZ 80.
In accordance with the FTZ Board’s
regulations, Kolade Osho of the FTZ
Staff is designated examiner to review
the application and make
recommendations to the Executive
Secretary.
Public comment is invited from
interested parties. Submissions shall be
addressed to the FTZ Board’s Executive
Secretary and sent to: ftz@trade.gov. The
closing period for their receipt is May
28, 2024. Rebuttal comments in
response to material submitted during
the foregoing period may be submitted
during the subsequent 15-day period to
June 11, 2024.
A copy of the application will be
available for public inspection in the
‘‘Online FTZ Information Section’’
section of the FTZ Board’s website,
which is accessible via www.trade.gov/
ftz.
For further information, contact
Kolade Osho at Kolade.Osho@trade.gov.
[Federal Register Volume 89, Number 75 (Wednesday, April 17, 2024)]
[Notices]
[Pages 27411-27413]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 2024-08168]
=======================================================================
-----------------------------------------------------------------------
DEPARTMENT OF COMMERCE
[Docket No. 240410-0103]
RIN 0690-XD001
AI and Open Government Data Assets Request for Information
ACTION: Notice, request for information.
-----------------------------------------------------------------------
SUMMARY: The U.S. Department of Commerce is committed to advancing
transparency, innovation, and the responsible use and dissemination of
public data assets, including for use by data-driven AI technologies.
To this end, we are pleased to issue this Request for Information (RFI)
to seek valuable insights from industry experts, researchers, civil
society organizations, and other members of the public on the
development of AI-ready open data assets and data dissemination
standards.
DATES: Comments must be received on or before July 16, 2024.
ADDRESSES: All electronic public comments on this action, identified by
Regulations.gov docket number DOC-2024-0007, may be submitted through
the Federal e-Rulemaking Portal at www.regulations.gov. The docket
established for this request for comment can be found at
www.regulations.gov, DOC-2024-0007. Click the ``Comment Now!'' icon,
complete the required fields, and enter or attach your comments.
FOR FURTHER INFORMATION CONTACT: Please direct questions regarding this
Notice to Victoria Houed at ContactOUSEA@doc.gov with ``AI-Ready Open
Data Assets RFI'' in the subject line, or if by mail, addressed to
Victoria Houed, OUSEA, U.S. Department of Commerce, 1401 Constitution
Avenue NW, Room 4848, Washington, DC 20230; telephone: (202) 913-1504.
SUPPLEMENTARY INFORMATION: The U.S. Department of Commerce (Commerce)
is committed to leading the way in producing and disseminating high-
quality public data. Commerce's data assets enable U.S. scientific
discovery, innovation, and economic growth, serving as an invaluable
asset to the country. In its mission to publish data for the American
public and achieve its strategic goal to ``expand opportunity and
discovery through data,'' Commerce is dedicated to continuously
refining its processes for creating, curating, and distributing its
data as new technologies emerge. This Request for Information (RFI)
seeks to understand ways to improve Commerce's creation, curation, and
distribution of its open data assets to facilitate the development and
advancement of AI technologies such as generative AI.
Commerce, as a premier data provider, has a long history of
adapting to technological change. In the past 40 years, Commerce has
moved data publication efforts into electronic forms, and in the past
20 years, that has included the provision of both data services and
tools to support discovery and exploration of Commerce's data. In the
last five years, Title II of the Foundations for Evidence-Based
Policymaking Act, commonly known as the OPEN Government Data Act, began
Commerce's commitment to the dissemination of open data assets in
[[Page 27412]]
machine-readable formats, or ``data in a format that can be easily
processed by a computer without human intervention while ensuring no
semantic meaning is lost'' (44 U.S.C. 3502(18)).
Today, Commerce is facing a new technological change with the
emergence of AI technologies that provide improved information and data
access to users. Commerce is specifically interested in generative AI
(GenAI) applications, which digest disparate sources of text, images,
audio, video, and other types of information to produce new content.
GenAI and other AI technologies present opportunities and challenges
for both data providers, such as Commerce, and data users, including
other government entities, industry, academia, and the American people.
AI has brought transformative changes to many industries including
health, finance, education, and transportation, while GenAI has the
promise of democratizing access to data by enabling the average person
to engage with data in ways that had not previously been possible.
Recent GenAI tools allow users to input simple prompts to engage with
content gathered by these tools from a wide range of sources, including
Commerce's public data.
The challenge for Commerce, as an authoritative provider of data,
is to ensure that these new AI intermediaries can appropriately access
its data without losing the integrity, including quality, of said data.
AI tools require large amounts of trustworthy information to accurately
respond to the needs of their users. As AI applications become more
sophisticated and ingrained in everyday life, the role of high-quality
data becomes increasingly critical. Commerce acknowledges, as a key
data producer, that in order for AI systems to utilize its data for
training and for instant data retrieval, its data may need to be
reconfigured in easily consumable formats. AI tools are increasingly
used for data analysis and data access, so Commerce hopes to ensure
that the data these tools consume is easily accessible and ``machine
understandable,'' versus just ``machine readable.'' Therefore, this RFI
explores how to achieve better data integrity, accessibility, and
quality for emerging AI technologies.
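
As a rough illustration of the distinction drawn above between ``machine readable'' and ``machine understandable'' data, the Python sketch below parses the same CSV cell twice: once as a bare string, and once joined with variable-level metadata that states what the value measures and in which unit. The column names, metadata fields, and values are hypothetical and are not drawn from any Commerce data product.

```python
import csv
import io

# Hypothetical machine-readable extract: it parses cleanly, but the column
# name "val" alone does not say what is measured, in which unit, or when.
RAW_CSV = "geo_id,val\n99,12345\n"

# Hypothetical variable-level metadata that carries the missing meaning.
COLUMN_METADATA = {
    "geo_id": {"label": "Area code", "datatype": "string"},
    "val": {
        "label": "Resident population",
        "datatype": "integer",
        "unit": "persons",
        "reference_period": "2020",
    },
}


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def describe(row, metadata):
    """Attach the metadata and a typed value to every cell in a row."""
    described = {}
    for column, raw in row.items():
        meta = metadata[column]
        value = int(raw) if meta["datatype"] == "integer" else raw
        described[column] = {"value": value, **meta}
    return described


rows = read_rows(RAW_CSV)
enriched = describe(rows[0], COLUMN_METADATA)

# Machine readable only: a string with no meaning attached.
print(repr(rows[0]["val"]))
# Closer to machine understandable: typed value plus label and unit.
print(enriched["val"]["label"], enriched["val"]["value"], enriched["val"]["unit"])
```

The design point is only that the meaning travels with the data rather than living in separate human-readable documentation; how such metadata should actually be structured is one of the questions this RFI poses.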
The uniqueness of emerging technologies such as GenAI arises from
the fact that the interpretation and use of data is no longer solely
executed by human experts (e.g., scientists, engineers, software
developers) who bring their own knowledge and understanding to working
with Commerce's data. This human understanding is grounded in shared
disciplinary knowledge and in human-readable documentation that
Commerce provides with its published data. AI systems currently lack
common knowledge and the ability to use such knowledge in their
activity. Although these systems demonstrate fluency and intelligence,
their outputs are often driven by contextual prediction rather than
higher-order reasoning capabilities. Recent AI systems are trained on
tremendous amounts of digital content and generate responses based on
the contextual properties of that content. However, these systems do
not truly ``understand'' the texts in a meaningful way. While there is
ongoing improvement, today's AI systems are fundamentally limited by
their reliance on extensive, unstructured data stores: their outputs
depend on the underlying data rather than on an ability to reason and
make judgments based on comprehension. Knowing this, Commerce seeks to
adhere to its strategic mission to ``expand opportunity and discovery
through data'' by disseminating public data in AI-ready formats while
ensuring no semantic meaning is lost.
To respond to the challenge and realize the opportunity offered by
these new technologies, it is important that Commerce enables AI
systems to access and use its public data assets correctly and
responsibly.
This RFI seeks feedback, recommendations, and suggestions from
industry experts, researchers, civil society organizations, and the
public regarding Commerce's creation, curation, and distribution of
data assets that are specifically designed to facilitate the
development and advancement of AI technologies such as GenAI.
Thus far, Commerce has made efforts to expose its public data
through structured APIs and is developing enriched metadata standards
for describing its data assets. To date, Commerce metadata has focused
on enabling discovery of data assets rather than the use of those data
assets by AI systems, but Commerce sees value in changing this focus.
Commerce seeks to further understand how it can make its data assets
AI-ready.
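
To make the contrast between discovery-oriented metadata and use-oriented metadata concrete, the following Python sketch builds a dataset description using the schema.org Dataset vocabulary, which is one possible choice among several. The dataset name, URLs, and variable are hypothetical placeholders; only the property names (for example, distribution and variableMeasured) come from schema.org.

```python
import json

# All names, URLs, and values below are hypothetical placeholders.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example monthly retail indicator",
    "description": "Illustrative record only; not a real Commerce data asset.",
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "distribution": [{
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.gov/data/retail.csv",
    }],
    # Discovery metadata often stops above. The entries below describe
    # individual variables so that software can interpret the values.
    "variableMeasured": [{
        "@type": "PropertyValue",
        "name": "sales",
        "description": "Estimated monthly sales",
        "unitText": "USD millions",
    }],
}


def discovery_fields(record):
    """Fields a catalog search would typically index."""
    return {k: record[k] for k in ("name", "description", "license") if k in record}


def variable_units(record):
    """Map each described variable to its declared unit, if any."""
    return {v["name"]: v.get("unitText") for v in record.get("variableMeasured", [])}


print(json.dumps(dataset, indent=2))
print(variable_units(dataset))
```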
In particular, Commerce wishes to explore the following:
• The use of knowledge graphs for variable level metadata,
allowing systems to better link human terms to data elements;
• Embracing standardized ontologies such as schema.org or
NIEM;
• Harmonizing and linking our internal ontologies and
vocabularies using knowledge graphs grounded in standardized
ontologies;
• Gathering internal and external written documentation of
existing data products and:
◦ Mining them for terminology to use in metadata harmonization
and linking; or
◦ Releasing them in raw formats for the training of AI models;
• Adopting data formats which allow for rich metadata as
well as generating metadata ``sidecars'' for more traditional formats
such as CSV or SAS;
• Using open standards for APIs with the ability to link
into knowledge graphs; and
• Improving guidance and metadata around appropriate data
usage and licensing for purposes such as research analytics, text-and-
data mining, and AI system ingestion.
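
Two of the ideas listed above, metadata ``sidecars'' for CSV files and knowledge graphs that link human terms to data elements, can be sketched together in a few lines of Python. This is a minimal sketch under stated assumptions: the sidecar layout, the ".meta.json" file suffix, the "ex:" concept identifiers, and the labels are all invented for illustration and do not follow any particular standard.

```python
import csv
import json
import tempfile
from pathlib import Path

CSV_TEXT = "area,pop\nA,100\nB,250\n"

# Hypothetical sidecar layout: one entry per column, each pointing at a
# concept identifier in a made-up vocabulary.
SIDECAR = {
    "file": "example.csv",
    "columns": [
        {"name": "area", "datatype": "string", "concept": "ex:GeographicArea"},
        {"name": "pop", "datatype": "integer", "concept": "ex:ResidentPopulation",
         "unit": "persons"},
    ],
}

# Tiny knowledge graph as (subject, predicate, object) triples linking
# human terms to the concepts referenced in the sidecar.
TRIPLES = [
    ("ex:ResidentPopulation", "label", "resident population"),
    ("ex:ResidentPopulation", "altLabel", "population"),
    ("ex:ResidentPopulation", "altLabel", "number of residents"),
    ("ex:GeographicArea", "label", "geographic area"),
]


def write_with_sidecar(directory):
    """Write the CSV file and its JSON sidecar next to each other."""
    data_path = Path(directory) / SIDECAR["file"]
    data_path.write_text(CSV_TEXT, encoding="utf-8")
    sidecar_path = data_path.with_name(data_path.name + ".meta.json")
    sidecar_path.write_text(json.dumps(SIDECAR, indent=2), encoding="utf-8")
    return data_path, sidecar_path


def load_typed(data_path, sidecar_path):
    """Read the CSV and cast each column using the sidecar's datatypes."""
    sidecar = json.loads(sidecar_path.read_text(encoding="utf-8"))
    casts = {c["name"]: int if c["datatype"] == "integer" else str
             for c in sidecar["columns"]}
    with data_path.open(newline="", encoding="utf-8") as handle:
        return [{k: casts[k](v) for k, v in row.items()}
                for row in csv.DictReader(handle)]


def column_for_term(term, sidecar, triples):
    """Resolve a human term to a column name via the graph's labels."""
    wanted = term.strip().lower()
    concepts = {s for s, p, o in triples
                if p in ("label", "altLabel") and o == wanted}
    for column in sidecar["columns"]:
        if column["concept"] in concepts:
            return column["name"]
    return None


with tempfile.TemporaryDirectory() as tmp:
    data_path, sidecar_path = write_with_sidecar(tmp)
    rows = load_typed(data_path, sidecar_path)

print(rows)
print(column_for_term("number of residents", SIDECAR, TRIPLES))
```

Keeping the metadata in a separate file leaves the traditional CSV untouched for existing users, while the concept identifiers give software a stable hook for matching everyday wording to the right column.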
Commerce seeks comment on the topics discussed above and responses
to the following questions:
Data Dissemination Standards
1. What data dissemination standards should Commerce adopt to
support human-readable and machine-understandable public data?
2. What formats, metadata, and documentation should be prioritized
to facilitate AI applications?
3. How does raw data, such as data from sensor networks, differ
from derived data, such as statistical data from the U.S. Census
Bureau, when it comes to metadata standards?
4. What data licensing practices, standards, and usage
considerations should Commerce consider to support broad, equitable,
and open access to its datasets and metadata?
5. What current standards exist or are under development that
Commerce should consider to clearly signal that its public data is
available for use by AI systems (or signal any accompanying conditions
or restrictions on said data)?
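
As one point of reference for question 5, the Robots Exclusion Protocol (robots.txt, RFC 9309) is an existing convention for telling automated crawlers which paths they may fetch. The Python sketch below uses the standard library's urllib.robotparser to evaluate a hypothetical robots.txt; the crawler name ``ExampleAIBot'', the paths, and the example.gov host are invented. Note that robots.txt governs crawling only and says nothing about licensing or permitted uses, so it is at most a partial answer to the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an open data site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Allow: /data/
Disallow: /internal/

User-agent: *
Disallow: /internal/
"""


def build_parser(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named crawler may fetch published data but not internal paths.
print(parser.can_fetch("ExampleAIBot", "https://example.gov/data/series.csv"))
print(parser.can_fetch("ExampleAIBot", "https://example.gov/internal/draft.csv"))
```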
Data Accessibility and Retrieval
1. How can Commerce's data assets be made more accessible and
valuable to the AI community (e.g., improved API access, web
crawlability, etc.)?
2. How can Commerce develop intuitive and accessible data portals
that facilitate easy navigation and retrieval of data sets?
3. What users should Commerce consider when disseminating our AI-
ready data? What atypical users should Commerce be sure to consider?
4. What measures can be taken to encourage user-friendly
interfaces, including clear labeling and readable
[[Page 27413]]
formats, for Commerce's online data resources?
5. How can Commerce better understand the needs of users for its
data and the return on its investment in making its data more AI-ready?
Partnership Engagement
1. How can industry and academic stakeholders collaborate with the
government to shape the design and dissemination of AI-ready open data?
2. What are the potential areas of partnership, and how can
industry and academia contribute to enhancing data quality, integrity,
and usefulness for AI purposes?
Data Integrity and Quality
1. What are best practices that industries have employed to enhance
the integrity and accuracy of public data when used in AI applications?
What are best practices for data verification and validation? What are
best practices for conducting regular audits and quality checks of data
used in AI applications?
2. How can we collectively address challenges related to
authenticity, bias, privacy, data quality, equity, and ethical use while
maintaining transparency and accountability?
3. What security protocols can be developed to mitigate risks of
unauthorized data access and manipulation?
4. How can Commerce promote transparency in data sourcing and
processing methods to enhance trust and reliability? What is the
expectation for reporting the quality of its data and how can we ensure
that information will be carried through and presented to the end user?
5. What validation processes can be established to maintain and
verify data accuracy and consistency?
6. How can Commerce facilitate comprehensive and transparent data
documentation for replication and analysis?
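
Questions 3 and 5 above touch on detecting manipulation and on validating accuracy and consistency. A minimal Python sketch of two common building blocks follows: comparing a file against a published SHA-256 digest, and checking each row against declared column types. The file contents, the column rules, and the idea that a provider would publish the digest alongside the data are assumptions made for illustration, not a description of any existing Commerce process.

```python
import csv
import hashlib
import io

# Hypothetical published file and the digest a provider might publish with it.
PUBLISHED = b"area,pop\nA,100\nB,250\n"
PUBLISHED_SHA256 = hashlib.sha256(PUBLISHED).hexdigest()

# Hypothetical declared column types.
RULES = {"area": str, "pop": int}


def is_unaltered(data, expected_hex):
    """Return True when the data's SHA-256 digest matches the published one."""
    return hashlib.sha256(data).hexdigest() == expected_hex


def validation_errors(data, rules):
    """List rows whose values cannot be cast to the declared column type."""
    reader = csv.DictReader(io.StringIO(data.decode("utf-8")))
    missing = set(rules) - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    errors = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for column, cast in rules.items():
            try:
                cast(row[column])
            except (TypeError, ValueError):
                errors.append(
                    f"line {line_no}: {column}={row[column]!r} is not {cast.__name__}"
                )
    return errors


# Simulate a copy that was modified after publication.
tampered = PUBLISHED.replace(b"250", b"2x0")

print(is_unaltered(PUBLISHED, PUBLISHED_SHA256), validation_errors(PUBLISHED, RULES))
print(is_unaltered(tampered, PUBLISHED_SHA256), validation_errors(tampered, RULES))
```

A digest check detects any change to the bytes but cannot say whether the original values were right, while type and range rules catch some implausible values but not deliberate, well-formed edits, so the two checks complement each other.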
Data Ethics
1. What steps are needed to establish clear legal and ethical
guidelines for AI data usage, ensuring privacy rights, preserving
property rights, and focusing on equitable outcomes?
2. What types of policies could Commerce implement to identify and
mitigate biases in AI algorithms, including ensuring diverse data
representation?
3. What are the best protocols for ethical data collection,
processing, and storage that prioritize data integrity and accuracy?
Commerce invites your comments and insights on the above questions,
as well as any additional input you deem relevant.
Oliver Wise,
Chief Data Officer, Department of Commerce.
[FR Doc. 2024-08168 Filed 4-16-24; 8:45 am]
BILLING CODE P