Difference between revisions of "Data Type"

From FileZilla Wiki
Revision as of 09:34, 9 February 2017

Files can be transferred between an FTP client and server in different ways. The FTP specification (RFC 959) calls these "data types". They are commonly referred to as "transfer modes", but this is not correct: in RFC 959, "transfer mode" is a separate setting (stream, block or compressed).

The different data types are:

  • ASCII
  • binary (called "image" in the specification)
  • EBCDIC
  • local
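
On the protocol level, RFC 959 selects the data type with the TYPE command, using a one-letter code per type. The following is a minimal sketch of that mapping; the dictionary keys and the helper name `type_command` are made up for illustration and are not part of FileZilla.

```python
# One-letter TYPE codes from RFC 959. The keys are labels chosen for this sketch.
TYPE_CODES = {"ascii": "A", "binary": "I", "ebcdic": "E", "local": "L"}


def type_command(data_type: str, byte_size: int = 8) -> str:
    """Build the TYPE command line for one of the four data types."""
    code = TYPE_CODES[data_type]
    if code == "L":
        # The local type takes an additional byte size argument.
        return f"TYPE L {byte_size}"
    return f"TYPE {code}"


print(type_command("binary"))  # TYPE I
print(type_command("ascii"))   # TYPE A
```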

In practice, however, only the ASCII and binary types are commonly used or even implemented.

The ASCII type is used to transfer text files. The problem with text files is that different platforms use different line endings. Microsoft Windows, for example, uses a CR+LF pair (carriage return and line feed), Unix-like systems, including Linux and Mac OS X, use only LF, and classic Mac OS (Mac OS 9 or older) uses only CR. The purpose of the ASCII type is to ensure that line endings are converted to whatever is correct on the receiving platform. According to the FTP specification, files sent in ASCII type are always transferred with CR+LF as the line ending.

When a file is transferred from the client to the server, the client has to make sure CR+LF is used. It therefore adds nothing (on Microsoft Windows), adds a CR before each LF (on Unix) or adds an LF after each CR (on legacy Mac OS). The server then adjusts the line endings again to match the platform it runs on: on Microsoft Windows nothing has to be removed, on Unix the superfluous CR is removed, and on legacy Mac OS the unneeded LF is removed.

The same happens when a file is downloaded from the server to the client: the server makes sure the line endings are CR+LF when sending the file and the client then strips away whatever is not needed as line ending on its platform.
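
The two conversion steps described above can be sketched as follows. This is a deliberately naive illustration, not the code of FileZilla or of any particular FTP server; the function names `to_network` and `from_network` are invented for this example.

```python
NETWORK_EOL = b"\r\n"  # line ending used on the wire in ASCII type


def to_network(data: bytes, local_eol: bytes) -> bytes:
    """Sender side: replace the platform's line ending with CR+LF."""
    if local_eol == NETWORK_EOL:
        return data  # e.g. Windows: nothing has to be added
    return data.replace(local_eol, NETWORK_EOL)


def from_network(data: bytes, local_eol: bytes) -> bytes:
    """Receiver side: replace CR+LF with the platform's line ending."""
    if local_eol == NETWORK_EOL:
        return data  # e.g. Windows: nothing has to be removed
    return data.replace(NETWORK_EOL, local_eol)


unix_file = b"one\ntwo\n"
wire = to_network(unix_file, b"\n")          # Unix sender adds CR
on_classic_mac = from_network(wire, b"\r")   # legacy Mac OS receiver drops LF
on_windows = from_network(wire, b"\r\n")     # Windows receiver changes nothing

print(wire)            # b'one\r\ntwo\r\n'
print(on_classic_mac)  # b'one\rtwo\r'
print(on_windows)      # b'one\r\ntwo\r\n'
```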

Because the file is changed whenever client and server do not run on the same kind of platform, this data type cannot be used for files containing arbitrary bytes, so-called binary files, such as images and videos. If it is used anyway, the binary files will most likely be corrupted and no longer work as expected.
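
To illustrate the corruption, the sketch below applies a naive CR+LF to LF translation, as a Unix server might do when receiving in ASCII type, to the 8-byte PNG file signature, which happens to contain both a CR+LF pair and a lone LF. The function name `ascii_receive_on_unix` is hypothetical and only models the behaviour described in the text.

```python
# The fixed 8-byte signature at the start of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def ascii_receive_on_unix(wire: bytes) -> bytes:
    """Naive ASCII-type receive on Unix: strip the CR from each CR+LF."""
    return wire.replace(b"\r\n", b"\n")


# A Windows client sends the bytes unchanged; the Unix server translates them.
received = ascii_receive_on_unix(PNG_SIGNATURE)

print(received == PNG_SIGNATURE)          # False
print(len(PNG_SIGNATURE), len(received))  # 8 7
```

One byte is lost from the signature alone, so the stored file no longer starts like a PNG file.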

Compared to ASCII type, binary type is the easier one: the file is just transferred as-is, and no line ending translation is done.

So when you are not sure what to use, always go for binary type. Nowadays, nearly all (good) text editors can handle all three line endings, and other text files, such as Perl or PHP scripts and XML files, (nearly) always work with any line ending as well.

Example

Client system: Windows (CRLF line endings).

Server system: Some Linux distribution (LF line endings).

If you upload a text file with 200 lines and a total size of 5768 bytes in ASCII type, it will have a size of 5568 bytes on the server, because each of the 200 CR+LF line endings is shortened by one byte to LF.
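
The arithmetic behind this example can be checked with a few lines of Python. The sketch assumes that every one of the 200 lines, including the last, ends with CR+LF; the helper name is made up for illustration.

```python
def size_after_ascii_upload(size_on_windows: int, line_count: int) -> int:
    """Size on a Unix server when each CR+LF (2 bytes) becomes LF (1 byte)."""
    bytes_saved_per_line = len(b"\r\n") - len(b"\n")
    return size_on_windows - line_count * bytes_saved_per_line


size_on_server = size_after_ascii_upload(5768, 200)
print(size_on_server)  # 5568
```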

Note

FileZilla does not analyse files uploaded as ASCII in any way, so if a file has mixed line endings, somewhat unexpected things can happen. The native line ending on Windows is CR+LF. As this is also what the FTP server expects for ASCII transfers, FileZilla on Windows does not apply any line ending translation at all. Now imagine a text file with mixed Windows (CR+LF) and Unix (LF) line endings. Uploading that file from a Windows-based system to a Unix-based server results in every CR+LF being translated to LF. Downloading the file again makes the FTP server convert every LF to CR+LF while sending it to FileZilla. As a result, all of the original LF line endings have effectively been converted to CR+LF.
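
The round trip described above can be traced with a small sketch. It models the Unix server with two naive replacements and treats FileZilla on Windows as passing the bytes through unchanged; the function names are invented for this illustration.

```python
def server_store_on_unix(wire: bytes) -> bytes:
    """Unix server receiving in ASCII type: CR+LF becomes LF."""
    return wire.replace(b"\r\n", b"\n")


def server_send_from_unix(stored: bytes) -> bytes:
    """Unix server sending in ASCII type: LF becomes CR+LF."""
    return stored.replace(b"\n", b"\r\n")


original = b"windows line\r\nunix line\nwindows again\r\n"

stored = server_store_on_unix(original)     # upload, client sends as-is
downloaded = server_send_from_unix(stored)  # download, client stores as-is

print(stored)      # b'windows line\nunix line\nwindows again\n'
print(downloaded)  # b'windows line\r\nunix line\r\nwindows again\r\n'
print(downloaded == original)  # False
```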

Another example is a text file with mixed line endings that FileZilla on Windows uploads to an FTP server running on Windows: no line ending conversion is done at all. Some text editors handle mixed line endings transparently, so the file looks fine in such an editor. Other programs do not handle this case, however, and programs on the server that consume the file might not work as expected because they are confused by the Unix-style line endings (LF) still embedded in it.

In yet another example, a Windows (CR+LF) text file is uploaded to a Unix-based FTP server in binary type. If that file is then downloaded in ASCII type, the FTP server translates each LF to CR+LF, so the CR+LF line endings become CR+CR+LF. FileZilla on Windows expects the file to already use CR+LF line endings (per the FTP specification), so no further translation is done. Depending on the text editor used, lines might now be separated by an additional empty line.
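
A short sketch of this last case, again modelling the server with a naive LF to CR+LF replacement (an assumption made for illustration, not a description of any specific server). Python's `splitlines()` treats a lone CR as a line break, much like some text editors do, which makes the extra empty lines visible.

```python
windows_file = b"first\r\nsecond\r\n"

# Binary upload: the Unix server stores the bytes exactly as sent.
stored = windows_file

# ASCII download: the server turns every LF into CR+LF.
sent = stored.replace(b"\n", b"\r\n")

print(sent)  # b'first\r\r\nsecond\r\r\n'

# A reader that accepts CR, LF and CR+LF as line breaks sees empty lines.
print(sent.decode("ascii").splitlines())  # ['first', '', 'second', '']
```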

Changing the data type in FileZilla

You can change the transfer data type in three ways with FileZilla:

  • In the preferences of FileZilla
  • In the main menu under Transfer -> Transfer type
  • By right-clicking the data type indicator in the status bar of FileZilla.