
Bounds vs BBox


I am trying to convert a PDF to HTML. The idea is to use the PDF Extract API to extract layout and styling information, which it returns as JSON output. I am trying to write a simple Python program that parses the JSON and generates the corresponding HTML elements.

For example, a <Figure> JSON element can be converted to an <img> HTML element.
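To make the idea concrete, here is a minimal sketch of such a tag mapping. It is only an illustration: apart from Figure -> img, the entries in TAG_MAP are hypothetical, the helper names are made up, and it assumes that text elements carry their content under a "Text" key and that the last component of "Path" names the element type.

```python
import html

# Hypothetical mapping: only Figure -> img comes from the post itself.
TAG_MAP = {"Figure": "img", "P": "p", "H1": "h1"}


def last_path_tag(path):
    """Return the last component of an element Path, without any [n] index."""
    name = path.rstrip("/").split("/")[-1]
    return name.split("[")[0]


def element_to_html(element):
    """Convert one extracted JSON element (a dict) to an HTML string."""
    # Unknown element types fall back to a generic <div>.
    tag = TAG_MAP.get(last_path_tag(element["Path"]), "div")
    if tag == "img":
        # Use the first exported file as the image source, if there is one.
        paths = element.get("filePaths") or [""]
        return '<img src="{}">'.format(html.escape(paths[0], quote=True))
    # Assumption: text content is stored under a "Text" key.
    text = html.escape(element.get("Text", ""))
    return "<{0}>{1}</{0}>".format(tag, text)


figure = {
    "Path": "//Document/Sect/Figure",
    "filePaths": ["figures/fileoutpart1.png"],
}
print(element_to_html(figure))
```

This sketch ignores placement entirely, which is where the question below comes in.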

The mapping from JSON tag to HTML tag is straightforward. What confuses me is that there are multiple bounds attributes. For instance:

"Bounds": [
87.047607421875,
2307.354721069336,
158.33660888671875,
2371.139617919922
],
"ClipBounds": [
87.047607421875,
2307.354721069336,
158.33660888671875,
2371.139617919922
],
"Page": 0,
"Path": "//Document/Sect/Figure",
"attributes": {
"BBox": [
200.45099999999366,
2591.719999999972,
271.7289999999921,
2655.5599999999395
],
"Placement": "Block"
},
"filePaths": [
"figures/fileoutpart1.png"
],

 

Which coordinates should be used to decide the placement of the corresponding <img> tag in the HTML: Bounds, ClipBounds, or BBox?
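For illustration, whichever array turns out to be the right one, the conversion to an HTML position could look like the sketch below. It rests on assumptions that are not confirmed in this post: that the four numbers are [left, bottom, right, top] in PDF points with the origin at the bottom-left corner of the page, and that the page height is known. The function name bounds_to_css and the page_height and scale parameters are hypothetical.

```python
def bounds_to_css(bounds, page_height, scale=1.0):
    """Turn a PDF-style box into an absolutely positioned CSS style string.

    Assumption: bounds is [left, bottom, right, top] with the origin at the
    bottom-left of the page, so the y axis must be flipped for HTML, where
    the origin is at the top-left.
    """
    left, bottom, right, top = bounds
    css_left = left * scale
    css_top = (page_height - top) * scale  # flip the y axis
    width = (right - left) * scale
    height = (top - bottom) * scale
    return (
        "position:absolute; left:{:.2f}px; top:{:.2f}px; "
        "width:{:.2f}px; height:{:.2f}px"
    ).format(css_left, css_top, width, height)


# Bounds taken from the JSON above; the page height is a made-up value.
bounds = [87.047607421875, 2307.354721069336,
          158.33660888671875, 2371.139617919922]
style = bounds_to_css(bounds, page_height=3000.0)
print('<img src="figures/fileoutpart1.png" style="{}">'.format(style))
```

The same function would accept the BBox array unchanged, so the open question is only which of the three arrays describes where the figure actually appears on the page.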

 
