File size limit?

Asked by COKEDUDE

I've noticed that gedit has problems opening files longer than about 8000 lines. Was gedit not designed for long files, or is there another problem? The same thing also happens with complicated HTML files, so I hope there is a way to fix this.

Question information

Language: English
Status: Answered
For: Ubuntu gedit
Assignee: No assignee
mycae (mycae) said :
#1

You are probably looking at files with invalid characters. Gedit may refuse to open them. I regularly use gedit on files with more than 100k lines in them.

https://bugs.launchpad.net/gedit/+bug/75151

You may want to try another text editor :( Try kwrite. However, this will pull in most of KDE, so it's a large download.

COKEDUDE (cokedude) said :
#2

Do you do a lot of programming? What do the files contain?

Both geany and kate are able to open these files with no problems.

mycae (mycae) said :
#3

>Do you do a lot of programming? What do the files contain?
I do a fair bit.

Text processing is quite complicated, mostly for historical and political reasons. Originally the USA-ians used ASCII, which allowed 7 bits per char. Then came variations that used 8 bits, because no one in Europe could type the characters of their native language, nor could anyone using a non-Western script write their symbols. Now there is a myriad of different text encoding methods, and that will stay true for at least the next 20-odd years, until we all give up on 8-bit ASCII (which won't die, because it's easy to program with). Hopefully, one of the UTF variants will eventually be accepted by all (https://secure.wikimedia.org/wikipedia/en/wiki/Unicode).

So your "text" files are simply binary files that you have a special method of interpreting, to make them appear as recognisable (to you) symbols on your screen.
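To make that concrete, here is a small Python sketch (my own illustration, nothing to do with gedit's code) that reads the same four bytes under two different encodings. The byte values and the choice of Latin-1 and CP437 are just assumptions picked for the example.

```python
# The same four bytes, interpreted two different ways.
raw = bytes([0x63, 0x61, 0x66, 0xE9])  # "caf" followed by the single byte 0xE9

as_latin1 = raw.decode("latin-1")  # 0xE9 maps to "é" in Latin-1
as_cp437 = raw.decode("cp437")     # the same byte maps to a Greek theta in CP437

# Under UTF-8 the lone 0xE9 is not a complete character at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(as_latin1, as_cp437, valid_utf8)
```

Nothing in the bytes themselves says which reading is the "right" one; the editor has to guess.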

Gedit tries to detect each file's encoding method; however, in these specifications, certain bit combinations are not mapped to particular shapes ("glyphs"), but are marked as reserved or disallowed.

If gedit misidentifies the file encoding, and thus misinterprets the bit sequence, or if the file contains characters that are disallowed by the specification (for example, all-zero bits are usually reserved for denoting the end of strings in memory), then it just gives up.
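If you want to check whether your files fall into this category, something like the following Python sketch may help. It assumes the file is meant to be UTF-8, and the function name `find_problems` is made up for this example; it does not reproduce gedit's own detection logic, it only reports the first NUL byte and the first sequence that strict decoding rejects.

```python
def find_problems(data: bytes, encoding: str = "utf-8"):
    """Return sorted (offset, description) pairs for bytes a strict editor may reject."""
    problems = []

    # A NUL byte is legal UTF-8, but many tools treat it as a sign of binary data.
    nul = data.find(b"\x00")
    if nul != -1:
        problems.append((nul, "NUL byte"))

    try:
        data.decode(encoding)  # strict decoding raises on the first invalid sequence
    except UnicodeDecodeError as err:
        problems.append((err.start, f"invalid {encoding} sequence: {err.reason}"))

    return sorted(problems)
```

To use it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. An empty list means strict decoding found nothing wrong under the assumed encoding.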

Actually it decodes what it can fine, but refuses to show you that -- the programmers seem to think this is a good idea, to "protect" the user from ugly symbols, perhaps, I am unsure.

Kwrite, for example, simply replaces them with a symbol to denote that there is a bit sequence here that it cannot interpret, so you might see, for example, a black diamond with a question mark in it. I consider that a more rational behaviour in a text editor.
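The replace-and-carry-on behaviour is easy to sketch in Python. This is only an illustration of the idea, not Kwrite's implementation, and the sample bytes are made up: decoding with `errors="replace"` swaps each undecodable sequence for U+FFFD, which is the character usually drawn as that black diamond with a question mark.

```python
raw = b"caf\xe9 ok"  # the lone 0xE9 is not valid UTF-8

# Keep everything that decodes, and mark the part that does not with U+FFFD.
shown = raw.decode("utf-8", errors="replace")

print(shown)
```

The rest of the text stays readable, and the marker shows exactly where the bad bytes are.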

Feel free to CC yourself in on that bug -- maybe with enough people (politely) on CC, the programmers will realise that this is not a desirable behaviour.
