Difference between revisions of "Data Type"

From FileZilla Wiki
m (Reverted edits by 198.101.245.16 (talk) to last revision by CodeSquid)
Revision as of 20:27, 27 September 2012

Files can be transferred between an FTP client and server in different ways. The FTP specification (RFC 959) calls these "data types". They are commonly referred to as "transfer modes", but that is not correct: RFC 959 uses the term "transfer mode" for something else (stream, block and compressed mode).

The different data types are:

  • ASCII
  • binary (called "image" in the specification)
  • EBCDIC
  • local

In practice, however, only the ASCII and binary types are commonly used or even implemented.
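As a rough illustration, a client selects a data type by sending the TYPE command defined in RFC 959. The sketch below maps the names from the list above to the corresponding command strings. The function name `type_command` and the dictionary `TYPE_CODES` are made up for this example and are not part of FileZilla or any library.

```python
# Type codes used as the argument of the FTP TYPE command (RFC 959).
TYPE_CODES = {
    "ascii": "A",
    "binary": "I",  # called "image" in the specification
    "ebcdic": "E",
    "local": "L",
}


def type_command(data_type: str, byte_size: int = 8) -> str:
    """Build the TYPE command line for a data type (hypothetical helper)."""
    code = TYPE_CODES[data_type.lower()]
    if code == "L":
        # The local type additionally carries a logical byte size.
        return f"TYPE L {byte_size}"
    return f"TYPE {code}"


print(type_command("ascii"))
print(type_command("binary"))
```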

ASCII type is used to transfer text files. The problem with text files is that different platforms use different line endings. Microsoft Windows, for example, uses a CR+LF pair (carriage return followed by line feed), Unix(-like) systems, including Linux and Mac OS X, use only LF, and traditional Mac OS systems (Mac OS 9 or older) use only CR. The purpose of ASCII type is to ensure that line endings are converted to whatever is correct on the receiving platform. According to the FTP specification, files transferred in ASCII type always use a CR+LF pair as line ending on the wire.

So when a file is transferred from the client to the server, the client has to make sure CR+LF is used. To do so, it adds nothing (on Microsoft Windows), adds a CR before each LF (on Unix) or adds an LF after each CR (on legacy Mac OS). The server then adjusts the line endings again to what is used on the platform it runs on. On Microsoft Windows nothing has to be removed, on Unix the superfluous CR is removed, and on legacy Mac OS the unneeded LF is removed.

The same happens when a file is downloaded from the server to the client: the server makes sure the line endings are CR+LF when sending the file and the client then strips away whatever is not needed as line ending on its platform.
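The two conversion steps described above can be sketched roughly as follows. This is only an illustration, not how any particular client or server implements it. The names `to_wire` and `from_wire` are hypothetical, and the sketch assumes the input consistently uses the sender's native line ending.

```python
WIRE_EOL = b"\r\n"  # ASCII type always uses CR+LF during the transfer


def to_wire(data: bytes, local_eol: bytes) -> bytes:
    """Sender side: convert native line endings to CR+LF."""
    if local_eol == WIRE_EOL:
        return data  # Windows: nothing to add
    return data.replace(local_eol, WIRE_EOL)


def from_wire(data: bytes, local_eol: bytes) -> bytes:
    """Receiver side: convert CR+LF to the native line ending."""
    if local_eol == WIRE_EOL:
        return data  # Windows: nothing to remove
    return data.replace(WIRE_EOL, local_eol)


# Upload from a Unix client (LF) to a legacy Mac OS server (CR).
unix_text = b"first line\nsecond line\n"
on_wire = to_wire(unix_text, b"\n")
stored = from_wire(on_wire, b"\r")
print(on_wire)
print(stored)
```

A download works the same way with the roles swapped: the server calls the sender-side step and the client the receiver-side step.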

Because the file is changed if client and server are not running on the same kind of platform, this data type cannot be used for files containing arbitrary bytes, so-called binary files, such as images and videos. If it is used anyway, the binary files will most likely be corrupted and will no longer work as expected.
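To see why this corrupts binary files, consider the 8-byte PNG file signature, which happens to contain a CR+LF pair. The sketch below models an ASCII-type upload from a Windows client to a Unix server as a simple byte replacement. The function name is hypothetical and the model is simplified.

```python
def ascii_upload_windows_to_unix(data: bytes) -> bytes:
    """Simplified model of an ASCII-type upload (hypothetical helper)."""
    on_wire = data  # Windows client: line endings are already CR+LF
    return on_wire.replace(b"\r\n", b"\n")  # Unix server strips each CR


# The PNG file signature: 89 50 4E 47 0D 0A 1A 0A
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

received = ascii_upload_windows_to_unix(png_signature)
print(len(png_signature), len(received))
print(received == png_signature)
```

The byte 0x0D was never a line ending here, yet it is removed, so the file on the server no longer starts with a valid signature.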

Compared to ASCII type, binary type is the easier one: the file is just transferred as-is, and no line ending translation is done.

So when you are not sure what to use, always go for binary type. Nowadays, nearly all (good) text editors can handle all three line endings, and other text files, such as Perl or PHP scripts and XML files, (nearly) always work with any line ending as well.

Example

Client system: Windows (CRLF line endings)

Server system: Some Linux distribution (LF line endings)

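If you upload a text file with 200 lines and a total size of 5768 bytes using ASCII type, it will have a size of 5568 bytes on the server: one CR is removed from each of the 200 line endings (5768 - 200 = 5568). With binary type, the size would remain 5768 bytes.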
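The sizes in this example can be checked with a short sketch. The file content is made up so that it has exactly 200 CR+LF terminated lines and 5768 bytes. The server-side conversion is modelled as a plain replacement of CR+LF with LF.

```python
# 200 lines with 5368 bytes of text in total (168 * 27 + 32 * 26),
# plus 2 bytes (CR+LF) per line ending on Windows.
lines = [b"x" * 27] * 168 + [b"x" * 26] * 32
windows_file = b"".join(line + b"\r\n" for line in lines)

# ASCII-type upload to a Linux server: each CR+LF becomes a single LF.
linux_file = windows_file.replace(b"\r\n", b"\n")

print(len(lines))         # number of lines
print(len(windows_file))  # size on the Windows client
print(len(linux_file))    # size on the Linux server
```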