
Importing a User Tool does not format the same as the PDF file


GardenGuy42


When I import a file that was written in Word 2010 and converted to PDF, the formatting gets changed. For example:

 

1. Italicized words get changed back to normal (not all; it appears to be mostly single words).

 

2. Paragraph alignment randomly gets changed from Left to Center or Right alignment. (Sometimes a single paragraph, but can also be several in a row).

 

3. Paragraph formatting randomly gets changed to indented or hanging. (Sometimes a single paragraph, but can also be several in a row).

 

4. Sometimes entire paragraphs get reordered. As an example:

 

A Word about Words

(A). The expression most often translated…

(B). Both words are used in reference to Christ as Lord…

Got changed to:

A Word about Words (B). Both words are used in reference to Christ as Lord…

(A). The expression most often translated…

 

5. The Tool Editor does not always update correctly: The Worksheet displayed Both and the Tool Editor showed Both. Deleting Both and replacing it, then Updating the file, the Worksheet still displayed Both. All other updates displayed correctly. This also is a random phenomenon.

 

Are these known bugs? Is there a fix for these issues?

 

Accordance 13 is running on a Toshiba Satellite L75D in Windows 8.1. The word processor being used is Microsoft Word 2010.

 

 


A lot of this depends on the nature of your pdf, and the program used to generate the pdf.  Can you send me the pdf, so I can try an import over here?


I'm running OS 10.15.4, Accord. 13.0.4

 

I  share the FRUSTRATION!

 

I imported a PDF file generated using Mellel 4.0 .

I got it imported as a PDF User Tool,

HOWEVER......it imported pretty much as Plain Text.

- formatting disappeared

- underlining gone

- text moved to wrong locations

- etc.

 

I've spent 3 days attempting to re-format the text -

using the User Tool Edit Window (which I reported on April 1).

The Edit Tool for User Tools DOES NOT WORK RIGHT.

There is no Find And Replace available.

There is no way to change the text size of the window without changing the Resolution of my monitor.

There is no scroll bar to show where you are in the document.

And it's crashed Accordance twice in 3 days.

 

I've started over three times now -

and I am about ready to ABANDON ACCORDANCE!

 

When I Update/Save my work - IT CORRUPTS IT AGAIN!

Fonts are changed from Bold Red Yehudit to non-bold Orange, Italics

(there is no italics in Hebrew)

Underlining disappears,

text is relocated, etc.

All the work I just did needs to be reworked AGAIN.

 

This is EXASPERATING!

 

Here's a link to my PDF file:

alephtau-bible-complete-v10.pdf

 

I originally tried to code this and import it as a User Bible.

That proved futile!

One specific error continued to pop up -

in a phrase that repeats itself 3541 times in the text.

I researched ALL of them - and could not find a reportedly missing "opening code".

 

So I bought the upgrade to Accordance 13 -

thinking I'd try the PDF import route.

This is proving as frustrating as trying to Code my User Bible -

with NO coding experience!

 

If there is someone on this forum who would like to TRY to Code this text

- just the text itself, and not the notes (blue text)

I'd be VERY GRATEFUL for the help getting it Coded.

OR:

If someone would like to try to re-format the PDF File

I would welcome that help.

 

If anyone has suggestions PLEASE let me see them!

Thank you for your kind assistance!

 

Obed Benyah


Hi Joel,

 

You mentioned “A lot of this depends on the nature of your pdf”, so I tested 4 different programs to produce a pdf file to import into “User Tools”. All 4 of the pdf files had problems with italicized words, mostly single words. Paragraphs with several continuous italicized words imported correctly, however.  The 4 programs, listed best to worst, were:

 

1. “Libre Office” and “pdfNova” both changed 1 or 2 italicized words back to normal. (Sometimes adjacent words were flipped to italic.)

 

2. “Open Office”: same as above, and it rearranged the order of 2 paragraphs.

 

3. “Microsoft Word 2010 & 16” Same as 1 and 2 above. And

 

(A) Paragraph alignment randomly got changed from Left to Center or Right alignment. (Sometimes a single paragraph, but can also be several in a row).

 

(B) Paragraph formatting randomly got changed to indented or hanging. (Sometimes a single paragraph, but can also be several in a row).

 

The problem with 1 or 2 italicized words getting flipped in “User Tool” appears to be in Accordance, because every method I have tried flips them in the same way and in the same places. The issues noted in 2 and 3 above appear to be more source related. Can you suggest anything that will convert a Microsoft Word file into a PDF file that meets the Accordance requirement?

 

Guy


GardenGuy:  Take a look at this import, see if it works better for you:

 

The Gospel According to Jesus C.acc7.zip

 

I imported it on my Mac, which has a more reliable PDF parser than Windows does.  We are working on improving the Windows parser, but it is a major task.  It does an "OK" job, but far worse than the Mac one.

 

One thing I'd like to point out, though, is that PDF is fundamentally a very difficult format.  There is rarely any context of "this word follows this word, here's a new line, etc.".  A word is simply positioned on the page.  So, Accordance has to guess where to place line breaks.  All of this being said, a cursory look through the import suggests it is pretty solid. There are a few places with wrong justification or missing line breaks, but those can be edited and cleaned up fairly easily.
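
To make the "guessing" concrete, here is a minimal sketch of how an importer might rebuild lines and paragraphs from positioned words. This is not Accordance's parser; the `Word` type, the function names, and the tolerance values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge, in points (hypothetical field)
    y: float  # baseline, in points from the top of the page (hypothetical field)


def group_into_lines(words, y_tolerance=2.0):
    """Group words whose baselines are within y_tolerance into one line."""
    lines = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if lines and abs(lines[-1][-1].y - w.y) <= y_tolerance:
            lines[-1].append(w)
        else:
            lines.append([w])
    # Order the words inside each line from left to right.
    return [sorted(line, key=lambda w: w.x) for line in lines]


def guess_paragraphs(lines, line_height=12.0, gap_factor=1.5):
    """Merge lines into paragraphs; a large vertical gap starts a new one."""
    paragraphs = []
    prev_y = None
    for line in lines:
        y = line[0].y
        text = " ".join(w.text for w in line)
        if prev_y is not None and (y - prev_y) <= line_height * gap_factor:
            paragraphs[-1] += " " + text  # treated as a wrapped line
        else:
            paragraphs.append(text)  # treated as a new paragraph
        prev_y = y
    return paragraphs


if __name__ == "__main__":
    page = [
        Word("God", 72, 140), Word("the", 90, 100.4), Word("In", 72, 100),
        Word("created", 95, 140), Word("beginning", 72, 112),
    ]
    print(guess_paragraphs(group_into_lines(page)))
```

Note that a deliberate break at the end of every line and ordinary word wrap look identical to a heuristic like this, since both are just words at the next baseline. Whatever threshold is picked will be wrong for some documents.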


Obed:

 

Let me reply to many of your comments directly:

 

 

I imported a PDF file generated using Mellel 4.0 .

I got it imported as a PDF User Tool,

HOWEVER......it imported pretty much as Plain Text.

- formatting disappeared

- underlining gone

- text moved to wrong locations

- etc.

I imported the provided pdf file, and a lot of the information is there.  One of the big problems with your user tool is the nature of your line breaks.  Accordance can't tell if you deliberately want to have a break at the end of each line, or if it is a column that is wrapping down next to an image.  While I understand your frustration at having to re-add all of the line breaks, another user would have the same frustration having to delete all of the extra line breaks.

 

However, and this is important, you note you have the source file already in Mellel.  Why not just copy and paste your data in, then?  This way, it avoids going through the destructive process of becoming a pdf.

 

 

 

There is no Find And Replace available.

Yes there is, Search Menu -> Find/Replace (Cmd-F)

 

 

 

There is no way to change the text size of the window without changing the Resolution of my monitor.

This is being addressed in two ways.  First, as a bug-fix for 13.0.5, we are scaling up the size of content imported from PDF.  There was a ~33% adjustment we were missing, making everything a bit smaller than the source file.  Second, as a new feature in the next 'bigger' release of Accordance (don't worry, still free upgrade for v13 users), there will be a batch option to change all of the fonts, sizes, styles, etc. in a User Note or User Tool.

 

 

 

There is no scroll bar to show where you are in the document.

Ah, an interesting bug! Please report issues like this as you see them, rather than collecting them in a long post.  My scrollbar was also hidden, but as soon as I resized the window, the scrollbar reappeared.  Does this work for you?

 

 

 

And it's crashed Accordance twice in 3 days.

If you are getting crashes, please report them, along with the crash log, so we can fix the bugs ASAP.

 

 

 


When I Update/Save my work - IT CORRUPTS IT AGAIN!

Fonts are changed from Bold Red Yehudit to non-bold Orange, Italics

(there is no italics in Hebrew)

Underlining disappears,

text is relocated, etc.

All the work I just did needs to be reworked AGAIN.

 

This is EXASPERATING!

If you have a reproducible case, this would be great for us to see and fix.  I'm looking for a case like "open this user tool, make this change, save, and now this other thing has incorrectly changed."  That we can try on our end.

 

 

 

 

I originally tried to code this and import it as a User Bible.

This could be a User Bible, but you would need to remove all of the commentary (blue text) and the Hebrew text in red, and adjust the formatting a bit.

 

 

 

One specific error continued to pop up -

in a phrase that repeats itself 3541 times in the text.

I researched ALL of them - and could not find a reportedly missing "opening code".

What was the error?  When did the error occur, when importing it as a user bible?  Did you read the Help and format your source file to match the requirements for a User Bible?

 

 

In general, you are importing a very large PDF with less common formatting.  The import is working, but due to how you format your text, it is making some decisions that you disagree with.  Why not just copy and paste straight from Mellel, skipping the PDF entirely?


Joel,

 

Thank you for your reply this morning.

I'll attempt to respond with as much information as I can.

 

I imported the provided pdf file, and a lot of the information is there.  One of the big problems with your user tool is your nature of line breaks.  Accordance can't tell if you deliberately want to have a break at the end of each line, or if it is a column that is wrapping down next to an image.  While I understand your frustration to have to re-add all of the line breaks, another user would have the same frustration having to delete all of the extra line breaks.

 

However, and this is important, you note you have the source file already in Mellel.  Why not just copy and paste your data in, then?  This way, it avoids going through the destructive process of becoming a pdf.

 

First, it's not just the line breaks that are an issue.

That may have something to do with the fact that I get a shift from Left Justification to Center Justification randomly. It's not consistent throughout the document.

 

I was not aware that I could copy and paste from Mellel into Accordance.

Is that for any User Tool?

Will that work for a User Bible import?

What do I open to copy and paste it in?

 

Also, are we talking about the PDF import or the User Bible Import coding in regard to line breaks?

In coding the text alone, not the notes, for a User Bible Import the line breaks are coded directly.

In the PDF import there's no way to set a line break that I'm aware of.

 

Can I just import the Mellel file as a user tool - without converting it to a PDF.

I thought that was the point of providing a PDF Import for User Tools....

 

RE: User Bible Import -

Yes, I did read the help file on User Bible Imports.

I followed those formats as much as I know how.

However, there are things I don't know about how HTML coding works.

I don't know what "interrupts" an "opening code" ..... and a "closing code".

Does punctuation in a sentence interrupt them?

Do line breaks interrupt them?

Do Italics interrupt them?

How do I know why a code fails when the coding appears to be done correctly,

yet still throws an error code?

And the error code sometimes shows a verse reference for the error,

but much of the time there is NO VERSE REFERENCE SHOWN.

In trying to code this as a User Bible - with just the text and not the notes (blue),

when I get an error involving a phrase that repeats itself 3541 times

I have no clue where to start looking for the problem.

Then - having looked at ALL 3541 references -

and finding NO PLACE where there is a missing code,

I'm left totally lost about what to do next???????

 

This text was coded previously by a very helpful person who monitors this Forum.

I had it imported into Accordance as a User Bible.

I simply took what he had done and made a number of corrections to the text  as I refined it.

Using his coding as my guideline, I then attempted to re-import the file.

I got part of it to import and got an error message coded to a verse.

I corrected that and tried to import it again.

I then got the repeat phrase as an error stating that I had "no 'opening code' for a given 'closing code'".

It happened to be an underline code.

I've been through the coding multiple times and cannot find ANY missing opening underline code.

Yet the error message continues to repeat itself.
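
For anyone chasing the same kind of error, here is a minimal sketch of a checker that reports the line number of any closing code with no matching opening code. It assumes HTML-like tags such as `<u>` and `</u>`. The exact codes the User Bible import accepts, and whether a code may span a line break, are not confirmed in this thread, so the tag list and the `reset_each_line` option are assumptions to adjust against the Accordance Help. The function name is hypothetical.

```python
import re

# Matches opening and closing HTML-like tags, e.g. <u> or </u>.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def find_unbalanced_tags(text, tags=("u", "b", "i"), reset_each_line=False):
    """Return (line_number, message) pairs for unmatched tags.

    tags: tag names to check (assumed, not taken from the Accordance Help).
    reset_each_line: if True, a tag left open at the end of a line is
        reported, on the assumption that codes must close on the same line.
    """
    problems = []
    open_stack = []  # (tag_name, line_number) of tags still open
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in TAG_RE.finditer(line):
            closing = m.group(1) == "/"
            name = m.group(2).lower()
            if name not in tags:
                continue
            if not closing:
                open_stack.append((name, line_no))
            elif open_stack and open_stack[-1][0] == name:
                open_stack.pop()
            else:
                problems.append(
                    (line_no, f"closing </{name}> has no matching opening <{name}>")
                )
        if reset_each_line:
            problems.extend(
                (n, f"opening <{t}> is never closed") for t, n in open_stack
            )
            open_stack.clear()
    # Anything still open at the end of the text was never closed.
    problems.extend((n, f"opening <{t}> is never closed") for t, n in open_stack)
    return problems


if __name__ == "__main__":
    sample = (
        "Gen 1:1 <u>In the beginning</u> God\n"
        "Gen 1:2 and the earth</u> was\n"
        "Gen 1:3 <u>And God said"
    )
    for line_no, message in find_unbalanced_tags(sample):
        print(f"line {line_no}: {message}")
```

Running something like this over the source file narrows 3541 repeats of a phrase down to the specific lines where a tag is unmatched, which is easier than checking each occurrence by eye.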

 

Further, why do I need to remove the Hebrew text in red in order to import this as a User Bible?

That makes NO sense to me at all.

Can Accordance not handle Hebrew text in an imported User Bible?

That seems rather absurd.

Is it the color that's a problem?

 

I'll report future crashes.

I did send a crash log report to Apple when it requested it.

I'm not sure I can duplicate the crashes......

I'm also not sure I can duplicate the "Update" alterations, but will attempt to do so as I try to work with this.

I can't send them at this point since I've wiped them and re-formatted some of the things already.

 

When might we anticipate 13.0.5 - and the next  major update for 13.1?

 

Thank you for your help on this!


Yes there is, Search Menu -> Find/Replace (Cmd-F)

 

Thank you for this. 

I had looked for this in the Drop Down Edit Menu.

It's not listed there.

This will prove very helpful.

 

Regarding the scroll bar:

This morning when I opened a new User Tool Edit window the scroll bar, with a moving indicator, shows up just fine.

It was not doing this previously.

There was a "scroll bar" area showing, but there was no indicator (scrolling block) of where I was in the document.

I'm not sure what's going on there.

Further, I'm never sure what is a "bug" in the program and what is not.

All I know is there's a problem I don't know how to address.

As I've mentioned, I'm not a computer "geek".

I don't have a degree in computer science.


"GardenGuy:  Take a look at this import, see if it works better for you:

 

The Gospel According to Jesus C.acc7.zip"

 

 

 

The file you sent did an excellent job with italics, getting them all correct, but had problems with paragraph formatting: some paragraphs were split, some were appended to others, and some were scrambled.


Joel,

 

I was just working with a Copy/Paste insertion into the User Tool Edit Window.

I copied only a small portion of text from my PDF and pasted it into the edit window.

After making sure the formatting was what I wanted, I clicked on "Update".

I LOST ALL of the imported text - from both the PDF Import file

AND from the User Tool Edit Window.

So much for Copy and Paste........

 

Side Note:

Both program crashes involved attempts to Update the imported PDF file

after making a number of changes to the formatting to correct it.

I re-formatted Genesis 1.1 thru Gen 8.21 (the limit on the import size).

As soon as I clicked the Update link the program crashed.

If I work with smaller amounts of text (20-30 lines) it does not crash.

 

I'm still wondering when we will see 13.0.5 or 13.1.

What can we expect?

 

Obed Benyah


Obed, please post your crash reports here, especially if you can get a reproducible case.  Checking out crash logs, I cannot find your reports at all.

 

For the Copy and Paste, I recommend you copy and paste *from* Mellel, not the pdf, and I recommend you paste into a fresh new tool.  Try it in small cases, see how it works.  Again, if you have reproducible cases, we can fix them.  I just tried editing my import of your pdf, and it did not crash.


Joel,

 

Thanks for your update.

How much text did you attempt to update?

I was working with the first 8 chapters of Genesis - through Gen. 8.21 - when the crashes occurred.

I haven't tried to reproduce the crash at this point so I don't have a way to report one.

 

Only one of the crash logs was sent to Apple.

I don't know where to find them on my system,

so I don't know if there's one here or not.

 

I've spent the last 45 minutes setting up a New User Tool for a trial run,

then copied and pasted from Mellel directly into the New User Tool.

 

Here's what I encountered:

1. I imported the first 3 pages of text, so not a large section.

2. When I imported it the formatting was significantly altered immediately,

even though it was simply copied and pasted

- line spacings were off

- underlining was gone

- bold text looked to be bold - but was in fact not bold, as I'll try to explain below.

- text size, once again, was virtually unreadable because of the reduced size issues.

3. I reset the resolution on my monitor so I could read the Edit Window.

4. I then made the corrections and hit "Update".

- The update did NOT match my inputs in all instances.

- the text was showing two different sizes

- line spacings were again altered

5. I went back into the Edit Window.

- I highlighted the text,

changed the font size from 9 point Arial to 10 point Arial -

and then restored it to 9 point Arial (the same as the original Mellel material)

6. This reset the line spacing and text sizing so it was now uniform.

7. Then I had to re-format the Hebrew characters to enlarge them so they were readable.

- I set them to 12 point Yehudit.

(I do this same thing in Mellel to get the Yehudit characters to match the size of the Arial font.)

8. I also had to restore the Bold text to several items since, specifically, the bold red text was no longer bold,

even though it appeared to be bold in the Edit Window.

9. I hit "Update" again.

10. Having done all of that I did end up with a correct representation of the text.

I did not get a crash when doing so.

However, this is a very small portion of text, so I really didn't expect that to be an issue.

 

At three pages in 45 minutes - with all the needed corrections -

and trying to move back and forth from Mellel to Accordance and the Edit Window

it is apparent that this is an extremely intensive and time-consuming way to have to do this.

Given the reality that I've got 2605 pages to re-import and re-format 

I'm not sure it's worth all the effort.

 

I'm also not sure what happens when I get to page 100 or page 1027, etc........

Will it crash?

Will I lose everything I've done, again?

Will I have to go back to page 1 and start over each time I update the file?

I can see these as very real possibilities, given what I've already been through.

 

It's difficult for me to manage to get usable screen shots of these issues.

If I could do that more easily I'd send them to you.

I'm just not that adept at managing screen shots.

 

Thanks for the help......

 

Obed


  • 1 month later...


 

I prefer to export files from Word as filtered HTML (Word 2016 or later) and import that instead. But often the issues are the same as the ones you describe.

