jsoup: Java HTML Parser

jsoup is a Java library that makes it easy to work with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS selectors, and XPath selectors.

jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers.

  • scrape and parse HTML from a URL, file, or string
  • find and extract data, using DOM traversal or CSS selectors
  • manipulate the HTML elements, attributes, and text
  • clean user-submitted content against a safe-list, to prevent XSS attacks
  • output tidy HTML
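
The capabilities listed above can be sketched roughly as follows. This is a minimal illustration, not code from the jsoup project: the class name, method names, sample HTML, and the example.com base URI are all hypothetical. It assumes jsoup 1.14.2 or later is on the classpath, since that is where the Safelist class was introduced (older releases call it Whitelist).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

// Illustrative sketch; class name, method names, and sample HTML are hypothetical.
final class JsoupFeatureSketch {

    static final String SAMPLE_HTML =
        "<html><head><title>Demo</title></head><body>"
        + "<div id=\"news\"><a href=\"/a\">First</a> <a href=\"/b\">Second</a></div>"
        + "</body></html>";

    // Parse a string and extract data with a CSS selector.
    // The base URI lets absUrl() resolve relative links.
    static String firstLinkUrl(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element link = doc.selectFirst("#news a");
        return link == null ? "" : link.absUrl("href");
    }

    // Change an element's text and attributes, then serialize the body.
    static String relabelFirstLink(String html, String newText) {
        Document doc = Jsoup.parse(html);
        Element link = doc.selectFirst("#news a");
        if (link != null) {
            link.text(newText).attr("rel", "nofollow");
        }
        return doc.body().html();
    }

    // Clean untrusted markup against a safe-list; tags outside the list are dropped.
    static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        System.out.println(firstLinkUrl(SAMPLE_HTML, "https://example.com/"));
        System.out.println(relabelFirstLink(SAMPLE_HTML, "Updated"));
        System.out.println(sanitize("<p>Hi <script>alert(1)</script><b>there</b></p>"));
    }
}
```

Safelist.basic() is only one of the predefined lists; which list is appropriate depends on how much formatting the application wants to allow in user-submitted content.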

jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating to invalid tag soup, and in each case it will create a sensible parse tree.
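
As a small illustration of that behavior, the sketch below feeds jsoup a fragment with no html, head, or body element and with unclosed p and b tags. The class name, method names, and input string are hypothetical; the sketch only assumes that jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Illustrative sketch; class name, method names, and input are hypothetical.
final class TagSoupSketch {

    // No <html>, <head>, or <body>, and neither <p> nor <b> is closed.
    static final String TAG_SOUP = "<p>One<p>Two <b>bold";

    // The second <p> implicitly closes the first, giving two sibling paragraphs.
    static int countParagraphs(String html) {
        return Jsoup.parse(html).select("p").size();
    }

    // The unclosed <b> still becomes an element that wraps the trailing text.
    static String boldText(String html) {
        return Jsoup.parse(html).select("b").text();
    }

    // The parser supplies the missing document skeleton.
    static boolean hasHeadAndBody(String html) {
        Document doc = Jsoup.parse(html);
        return !doc.select("html > head").isEmpty()
            && !doc.select("html > body").isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(countParagraphs(TAG_SOUP)); // 2
        System.out.println(boldText(TAG_SOUP));        // bold
        System.out.println(hasHeadAndBody(TAG_SOUP));  // true
    }
}
```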

See jsoup.org for downloads and the full API documentation.

Example

Fetch the Wikipedia homepage, parse it to a DOM, and select the headlines from the In the News section into a list of Elements:

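import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// log(...) is a small printing helper that is not shown in this snippet.
Document doc = Jsoup.connect("https://en.wikipedia.org/").get();
log(doc.title());
Elements newsHeadlines = doc.select("#mp-itn b a");
for (Element headline : newsHeadlines) {
  log("%s\n\t%s",
    headline.attr("title"), headline.absUrl("href"));
}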

Online sample, full source.

Open source

jsoup is an open source project distributed under the liberal MIT license. The source code is available on GitHub.

Getting started

  1. Download the latest jsoup jar (or add it to your Maven/Gradle build)
  2. Read the cookbook
  3. Enjoy!

Android support

When used in Android projects, core library desugaring with the NIO specification should be enabled to support Java 8+ features.

Development and support

If you have any questions on how to use jsoup, or have ideas for future development, please get in touch via jsoup Discussions.

If you find any issues, please file a bug after checking for duplicates.

The colophon talks about the history of and tools used to build jsoup.

Status

jsoup is in general release and is stable.