AI-powered pipeline that transforms academic PDF content into semantic markup that can be presented more inclusively