Programming: Issue #6 [Tutorial][Python]: Basic Python Variables


I’ve realised that many people don’t have a clear idea of how Python works, either because they have experience in other languages and are still adapting to Python, or because they have no programming experience at all. So I’ve decided to rework the blog structure and create more sections.

First of all: Python is a dynamically typed (“duck-typed”) programming language, which means you don’t have to give a variable an explicit type when you create it. Instead, Python works out the type internally from the value you assign.

It’s not the same to write this:

var = 10

as this:

var = 10.

In the first case we’re creating a variable that holds an integer. In the second, the variable holds a float.

Variable types can be checked using the built-in function type():

>>> type(var)

Here is a list of Python’s basic types:

– String (str):
Contains a sequence of characters.

>>> var1 = "Random text 12345 inside var1"

– Integer (int):
Variable for integers. Python has no limits on integer sizes.

>>> var2 = 1234567890133713370987654321

– Float:
Contains a decimal number.

>>> var3 = 13.37

– Complex:
A complex number in “real+imaginary” form. Note that Python uses “a+bj” notation.

>>> var4 = 42+7j

– List:
A mutable list of elements. Elements in the same list don’t need to have the same type. Note that a list can be empty.

>>> var5 = [123, "hello", 1.234, 3+3j]

>>> var6 = []

– Tuple:
An immutable tuple of elements. It can hold different types of elements, but its structure can’t change unless the variable is overwritten.

>>> var7 = (1234, "Hello World")

– Dictionary(dict):
A mutable collection that holds values in pairs. Each entry can be seen as a tuple holding a key and a value. When accessing the dictionary with a given key, the dictionary looks up that key’s entry and returns its related value. The key and value in each pair can have different types, both from each other and from those in other pairs. In the Python 2 versions this post was written for, dictionaries have no guaranteed order, meaning new pairs aren’t necessarily placed at the end (since Python 3.7, dictionaries do preserve insertion order).

>>> var8 = {"ok": 1234, "nook": "it's not ok"}

>>> var8["ok"]
1234
>>> var8["nook"]
"it's not ok"

– Boolean(bool):
A variable that can hold a True/False value.

>>> var9 = True

>>> var10 = False
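To see the types above in action, here is a small sketch that asks type() about one value of each kind, and then contrasts a mutable list with an immutable tuple. It is written for Python 3, and the names `samples`, `numbers` and `fixed` are my own, chosen just for this illustration.

```python
# Check the type Python infers for a few literal values.
samples = [
    "Random text",                   # str
    1234567890133713370987654321,    # int (arbitrary size)
    13.37,                           # float
    42 + 7j,                         # complex
    [123, "hello"],                  # list
    (1234, "Hello"),                 # tuple
    {"ok": 1234},                    # dict
    True,                            # bool
]

type_names = [type(value).__name__ for value in samples]
print(type_names)

# Lists can be changed in place; tuples cannot.
numbers = [1, 2, 3]
numbers.append(4)

fixed = (1, 2, 3)
try:
    fixed[0] = 99
    tuple_changed = True
except TypeError:
    # Item assignment on a tuple raises TypeError.
    tuple_changed = False
```

Nothing here declares a type explicitly; each one is inferred from the literal on the right-hand side.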

This is as far as it goes concerning basic variable types. I don’t see how this can be improved, since going deeper into any one type requires more than a single post to do it right, so I guess I’ll make separate sections for the more complex types, namely tuples, lists and dictionaries.

Posted in Programming | Tagged , , , , , , , , | Leave a comment

Programming: Issue #5 [Cryptography][I/O][Python]: File Cryptography

After a brief delay due to some crappy Java work that had to be done, here I am again with another Issue about cryptography in Python. This one has real-world utility, as the code I’m going to explain can be used to encrypt and decrypt any kind of file. I’ve tested it on various short .txt files, trying to catch edge cases, and also on some .mp3 files, all on a Linux system.

Here is the encoder function:


from Crypto.Cipher import AES
import hashlib
import os

def fileEncoderAES(password, fileinput, fileoutput, padding=" "):
	# Functions to handle files.
	# We delete any existing 'fileoutput' in the directory,
	# to avoid data overlapping.
	f = open(fileinput,'r')
	if(os.path.isfile(fileoutput)):
		os.remove(fileoutput)
	f2 = open(fileoutput, 'w')
	
	# Symmetric-Key and variable initialization.
	r = "$"
	blocksize = 128
	filetotal = ""
	key = bytes(hashlib.sha256(password).digest())
	mode = AES.MODE_CBC #  ECB CBC
	encryptor = AES.new(key, mode, key[:8] + key[-8:])

	# Main iteration. Encodes blocksize-character blocks until end of file.
	while(r != ""):
		r = f.read(blocksize)
		if(len(r) < blocksize):
			# Fill last block with padding character
			r = r + (blocksize-len(r)) * padding
			r = encryptor.encrypt(r)
			filetotal = filetotal + r
			break
		else:
			r = encryptor.encrypt(r)
			filetotal = filetotal + r

	# Save and close files
	f2.write(filetotal)		
	f.close()
	f2.close()

The first thing we have to do is specify and open the files that are going to be used for the encoding. We need the file we want to encode and the destination file. Both filenames are passed to the function as parameters.

If a file with the destination name already exists, this program deletes it first. I added this to avoid corrupted output: when an existing file is opened for in-place updating (mode “r+”), writing “bbb” over a file that contains “123456789” leaves “bbb456789” rather than just “bbb”. Strictly speaking, opening a file with “w” already truncates it, so the explicit removal is only a safety measure. Since managing files is not the main purpose of this Issue and I didn’t need more complex handling than this myself, I just kept it simple.
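A quick way to check how writing affects an existing file is to try both modes side by side. This is a minimal Python 3 sketch, assuming a scratch file called demo.txt in a temporary directory (both hypothetical): opening with “w” truncates the file before writing, while “r+” writes over the existing characters in place, which is where a result like “bbb456789” comes from.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

def write_then_read(mode):
    # Start from a known 9-character file.
    with open(path, "w") as f:
        f.write("123456789")
    # Write "bbb" using the mode under test.
    with open(path, mode) as f:
        f.write("bbb")
    with open(path, "r") as f:
        return f.read()

result_w = write_then_read("w")        # "w" empties the file first
result_rplus = write_then_read("r+")   # "r+" overwrites in place

print(result_w)
print(result_rplus)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(workdir)
```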

Note that the second parameter of the open() function establishes how the file is going to be used. “r” stands for read-only and “w” stands for write-only. There are others, including appending and read/write, plus a “b” suffix (“rb”, “wb”) for binary mode, which matters on platforms such as Windows.

In this function, we receive a password as a parameter. With the

key = bytes(hashlib.sha256(password).digest())

line we derive a fixed-size 32-byte key from our variable-length password, meaning that a password of any length can be used to generate a key. Any other hash whose digest is 16, 24 or 32 bytes long would work as well; MD5, for instance, produces 16 bytes.
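The fixed-size property is easy to verify with hashlib alone. The sketch below is written for Python 3, where the password has to be encoded to bytes before hashing; the helper name `derive_key` is my own and not part of the library.

```python
import hashlib

def derive_key(password):
    # Hash the password (as bytes) into a fixed-size 32-byte key.
    return hashlib.sha256(password.encode("utf-8")).digest()

short_key = derive_key("abc")
long_key = derive_key("a much longer passphrase than the first one")

# Digest sizes of a few hash functions, in bytes.
# AES accepts keys of 16, 24 or 32 bytes only.
digest_sizes = {
    "md5": hashlib.md5().digest_size,
    "sha1": hashlib.sha1().digest_size,
    "sha256": hashlib.sha256().digest_size,
}
print(len(short_key), len(long_key), digest_sizes)
```

Both keys come out 32 bytes long regardless of the password length. Of the three hashes listed, SHA-1 is the one whose 20-byte digest would not fit AES directly.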


mode = AES.MODE_CBC # ECB CBC
encryptor = AES.new(key, mode, key[:8] + key[-8:])

There is another difference from the first cryptography issue. Here I’m using AES in CBC mode. In ECB mode every block is encrypted independently with the same key, so identical plaintext blocks always produce identical ciphertext blocks. In CBC mode, each plaintext block is combined with the previous ciphertext block before being encrypted, meaning that two consecutive “xxxxxxxx” blocks wouldn’t be encoded as the same string. This offers much better protection against pattern analysis on files that tend to have some data repeated. To use CBC mode, the cipher requires an initialization vector with a length of 16 bytes (128 bits). It’s better if this one is pseudorandom; to avoid defining it manually, I just made it so it takes the first and last 8 bytes of our key, which means it stays the same for a given password.
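The difference between the two modes can be sketched without pycrypto. In the Python 3 example below, `toy_block_encrypt` is a hypothetical stand-in built on SHA-256: it is not AES and cannot be decrypted, it only plays the role of “some keyed block function” so that the chaining structure is visible. The IV is built the same way as in the post.

```python
import hashlib

BLOCK = 16  # AES block size in bytes

def toy_block_encrypt(key, block):
    # Stand-in for a real block cipher: NOT AES and not reversible.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor_bytes(a, b):
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

def ecb_encrypt(key, blocks):
    # ECB: every block is processed independently.
    return [toy_block_encrypt(key, block) for block in blocks]

def cbc_encrypt(key, iv, blocks):
    # CBC: each block is XORed with the previous ciphertext block first.
    previous = iv
    out = []
    for block in blocks:
        previous = toy_block_encrypt(key, xor_bytes(block, previous))
        out.append(previous)
    return out

key = hashlib.sha256(b"password").digest()
iv = key[:8] + key[-8:]                  # 16 bytes, as in the post
blocks = [b"x" * BLOCK, b"x" * BLOCK]    # two identical plaintext blocks

ecb = ecb_encrypt(key, blocks)
cbc = cbc_encrypt(key, iv, blocks)
print(ecb[0] == ecb[1])  # do identical blocks repeat under ECB?
print(cbc[0] == cbc[1])  # do they repeat under CBC?
```

Under ECB the two output blocks are equal because the inputs are equal; under CBC the second block depends on the first ciphertext block, so the repetition disappears.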

The main iteration is pretty simple: it keeps reading blocks of 128 characters from the input file, encoding them and storing them in a variable, and repeats this process until the end of the file is reached. If the last block is shorter than 128 characters, we fill it with a padding character (blank spaces by default) until it reaches a length of 128 characters, and then we store it and stop the iteration.

The last statements in the function write the result and close the files, freeing the resources they were using.

As for the decoder function, it’s pretty much the encoder inverted. The only thing that changes is the padding handling, since here we have to remove it rather than add it. The main iteration is not as complex, since we know we’ll always receive a file whose length is a multiple of 128 characters thanks to the padding added by the encoder function, so we don’t need anything to handle that special case.


def fileDecoderAES(password, fileinput, fileoutput, padding=" "):
	# Functions to handle files.
	# We delete any existing 'fileoutput' in the directory,
	# to avoid data overlapping.
	if(os.path.isfile(fileoutput)):
		os.remove(fileoutput)
	filetotal = ""
	f3 = open(fileinput, 'r')
	f4 = open(fileoutput, 'w')

	# Symmetric-Key and variable initialization.
	r = '$'
	blocksize = 128
	key = bytes(hashlib.sha256(password).digest())
	mode = AES.MODE_CBC #  ECB CBC
	decryptor = AES.new(key, mode, key[:8] + key[-8:])

	# Main iteration. Decodes blocksize-character blocks until end of file.
	while(r != ""):
		r = f3.read(blocksize)
		r = decryptor.decrypt(r)	
		filetotal = filetotal + r

	# Clean padding at the end of file.
	while(filetotal[-1:] == padding):
		filetotal = filetotal[:-1] 

	# Save and close files
	f4.write(filetotal)
	f3.close()
	f4.close()
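The padding round trip used by the two functions can be looked at in isolation. This is a Python 3 sketch working on bytes; the helper names `pad_block` and `strip_padding` are mine, and `rstrip` stands in for the decoder’s character-by-character loop.

```python
BLOCKSIZE = 128

def pad_block(block, padding=b" "):
    # Fill a short final block up to BLOCKSIZE bytes.
    return block + (BLOCKSIZE - len(block)) * padding

def strip_padding(data, padding=b" "):
    # Remove every trailing padding byte, as the decoder does.
    return data.rstrip(padding)

last_block = b"end of file"
padded = pad_block(last_block)
restored = strip_padding(padded)

# Caveat: data that really ends in spaces loses them too.
lossy = strip_padding(pad_block(b"trailing spaces   "))
print(len(padded), restored, lossy)
```

The last line shows the limit of this scheme: a file that genuinely ends in the padding character won’t be restored byte for byte, so the padding character should be one the data is unlikely to end with.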

In this library I’ve also included a function to calculate a file’s MD5, just to be sure I was handling padding correctly. If both files have the same MD5, you can be quite sure they have the same content.


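def fileMD5(fileinput):
	# Read the whole file and close it before returning the digest
	# (a close() placed after the return would never run).
	f = open(fileinput, 'r')
	digest = hashlib.md5(f.read()).hexdigest()
	f.close()
	return digest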

I’ll probably make another post more focused on file handling, since I guess it can be kinda useful; I had some trouble with it when writing this one (I found the ‘overwriting error’ the hard way).

I’m quite excited about this issue as it is the first one to have a real use that anyone can benefit from. The complete library about this Issue can be located here:

Issue 5 files

P.S: When encoding data, the extension of the output file doesn’t really matter. I’ve always tested it by creating a .txt file with the encoded data, and it worked well for all input file extensions.


Programming: Issue #4 [Cryptography][Python]: Basic Cryptography

– There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files. – Bruce Schneier


Enigma machine, used by WWII German Army

 

Cryptography has been one of the computer science topics that has captivated me the most so far. It is a subject of notable importance in computer engineering, but it’s crucial when it comes to networking. If someone is using a system with a decent firewall that isn’t exposed to external attacks, regular data doesn’t need to be encoded. However, we can’t assume anything when it comes to data traveling between networks, especially over the Internet. There are many kinds of attacks (Man-In-The-Middle is the first example that comes to mind) that focus on trying to intercept and/or manipulate data transfers between two computers. Cryptography comes as a solution that does not stop the attacker from getting the message, but keeps them from being able to read or understand it.

Since the whole project of this blog started as a step-by-step tutorial on how to make a program to transfer data between a client and a server (though I guess it won’t end there), even if we’re not going to send crucial data (such as credit card numbers, etc…) it is a matter of interest to understand and know how to use basic cryptography to send encoded data instead of plain text.

There are two main types of cryptography: symmetric-key and public-key cryptography. In this first issue about cryptography I will show the main ideas behind the two systems and provide a couple of simple examples, using AES for the symmetric-key section and RSA for the public-key one. I’m not going to go in depth into the algorithms they use, since this kind of low-level view is not the focus of the blog, but it’s easily googleable by anyone interested.

I’ll be using the pycrypto library, which can be downloaded for free from its webpage.

Symmetric-key Cryptography:

Symmetric-key algorithms use the same key for encoding and decoding the data between both parties. The idea behind them and their implementation are simpler than those of public-key algorithms, which also gives them better encoding/decoding times, and their keys don’t need to be as large as public-key system keys to be considered secure.

Here is a brief example of AES:

Which will return:

>>> @<I��*���3ܤ1aT��6�+

>>> This is a text message. Hello!!!

As intended, the ciphertext variable contains an unreadable string, and we get the original string back after decoding.

This code is pretty simple and it’s not difficult to understand, since most of the work occurs on the pycrypto libraries and we have no need to know about them as users.

– AES.new() creates a new cipher object from the password given as the first parameter. The second parameter selects a mode of operation, a “sub-algorithm” used to increase security by preventing the same string from being encoded as the same ciphertext twice. However, this is a very simple example and I’m using ECB mode, which does not provide any special security. In the next issue I’ll show and use other parameters to explain how they work.

– Note that the password has to have a length of either 16, 24 or 32 bytes. The longer the key, the greater the security. I’ll cover how to generate one of these later, in case you want to use a password that doesn’t have that exact number of characters.

– The message variable is just a regular Python string that we’ll encode and decode. Its length needs to be a multiple of 16. So what does this mean? Do we always have to send messages of 16, 32, 48, 64… characters? Yes, but we can simply pad the string with a character that we’re not using, like “{” or any other, to convert a “Hello!” into a “Hello!{{{{{{{{{{” and use the algorithm the same way. I’ll cover more on this in the next issue.

– C = X.encrypt(M) simply uses the cipher object X to encrypt a message M into C.

– The decrypt method is roughly the same, inverted. We could have used the same object ‘obj’; I created an ‘obj2’ just to show that, given the same password, two cipher objects encode and decode the same message.

Public-key Cryptography:

Public-key cryptography differs from symmetric-key cryptography in that two different keys are used: one for encoding the text and the other for decoding the ciphertext. In this tutorial we will use it only for text encoding, but it can be used to prove authenticity as well.

The key used for encoding is public, and is usually stored somewhere open to the world. Anyone can take that key and use it to encrypt data. But only the person who generated that key will be able to decrypt the ciphertext, using the second key, which must be kept private.

We’ll be using this kind of cryptography to send a private symmetric key in a ciphertext message. Since symmetric keys are faster to use and safer (comparing two keys of the same length), we’ll be using them to send encoded messages between clients and our server. But how can we set up that common, unique key? Simple: the server will generate a key pair, send the public part to the client and keep the private part for itself. Anyone will be able to intercept and read that message (we still have an authentication problem), but once the client generates a symmetric key and sends it back to the server encrypted with the public key, only the server will be able to read that message, and nobody else will.

Without further delay:

>>> ('a\x18\xca\x92"\x85\xe9g\xb2\xfd\xcf@\xc4\x08&\x86\xa2\xb4\xa5\xaf\xef.\xc3I\x9e\x91\xa8"{\xf1\xfc\xa6\xb9\xfc\xee\xb9\xddI\xb2L\xa3\xf5\x93b3\xbb\x99Q\xb3z\xef-\xe2\x12\xc2n\xc0ET\xdb\x8baj\xeb\xbe\x87NqyQ\x97\xd4\x89\x05#\xba\x10\xad`D\xa5\x17b\xfc\x9d\xf0Q^_/\xbbO\xbe\\\x01=\xad\xf6\x1fBb\x9fb\x8c\xd7)\xdd\xb03\x0bXG\xeaZ\xbf\xad|\x9a\xaa\xb0E!\xbc+\xb1jW\xd6',)
>>> SendotuxProgramming

– First we use a random generator to create a variable with a random value that we’ll use as a parameter to generate both keys.

– The first parameter of RSA.generate is the key length in bits. It has to be a multiple of 256 and greater than or equal to 1024. Note that this key length is vastly larger than the one we used for AES encoding. Note also how the generated ciphertext is much larger than the plain text.

– key.publickey() returns the public key part stored inside key. We can use this method to get the key and send or publish it, so anyone can send you encoded data that only you can decrypt.

– Finally we use that public key to encrypt the data and we use the main key (which includes both keys and other information) to decrypt it.
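To make the public/private split more concrete, here is textbook RSA with tiny primes in plain Python 3. This is only an illustration of the idea, not what pycrypto does internally: the numbers are far too small to be secure, there is no padding, and the function names are my own.

```python
# Textbook RSA with tiny primes, only to show the public/private split.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = 2753                   # private exponent: (e * d) % phi == 1

public_key = (n, e)        # can be published
private_key = (n, d)       # must be kept secret

def rsa_encrypt(message, key):
    modulus, exponent = key
    return pow(message, exponent, modulus)

def rsa_decrypt(ciphertext, key):
    modulus, exponent = key
    return pow(ciphertext, exponent, modulus)

message = 65                                   # must be smaller than n
ciphertext = rsa_encrypt(message, public_key)  # anyone can do this
recovered = rsa_decrypt(ciphertext, private_key)  # only the key owner can
print(ciphertext, recovered)
```

Anyone holding (n, e) can produce the ciphertext, but turning it back into the message requires d, which never leaves its owner.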

This concludes the first part about Python cryptography using pycrypto. Perhaps I’ll update this tutorial later if I see there was missing information somewhere. As always, here are the files used in the examples:

Issue 4 files

External links:

Pycrypto API database
Pycrypto Webpage
Pycrypto binaries for Windows


Programming: Issue #3 [Classes][Python]: Basic Classes

The next step in socket programming would be to create a server that can handle many connections at the same time. In order to do this, we’ll need to have multiple threads running in the server program. However, before moving on to threads, I think it can prove useful to explain the main concepts behind the implementation of classes.

This first example shows one of the most minimal classes with ‘real’ functionality one can create in Python:

This does nothing useful, but it’s a simple example that shows a class constructor and a method, and it serves to show how classes in Python work by default.

– __init__ is the special method that acts as the class constructor. To create an object of the class from the main program we use x = ClassTest(n), where x is the name of the object we’re creating and n is the value we pass to the constructor.

– printCT() is a method of the class that we can call on any object we have created; it returns the value of the i variable inside the object. In this case, it does the same as reading x.i, since the i value of this object can be accessed directly from outside the object.
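The class described in the two points above can be sketched as follows. This is a reconstruction from the description, written for Python 3 (so print is a function), and should be read as an assumption about what the listing looked like rather than the original code.

```python
class ClassTest:
    def __init__(self, n):
        # Constructor: store the value in the attribute i.
        self.i = n

    def printCT(self):
        # Return the stored value.
        return self.i

x = ClassTest(10)
y = ClassTest(15)

# Reading through the method and reading the attribute directly
# give the same kind of result.
total = x.printCT() + y.i
print(total)
```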

As an execution example:

———-
x = ClassTest(10)
y = ClassTest(15)
print x.printCT() + y.i
———-

Would print on screen ’25’.

This can be a problem for programmers coming from other languages such as C++, like myself: the class doesn’t have a section that specifies its attributes. This is because Python objects are mutable; every instance has a hidden dictionary (try printing ‘x.__dict__’!) that holds every attribute we add to it. These attributes can also be deleted using the ‘del’ statement. We can do that from outside the class with ‘del x.i’ or from inside it with ‘del self.i’.

Example:
———-
x = ClassTest(15)
del x.i
print x.printCT()
>>> AttributeError: ClassTest instance has no attribute 'i'
———-
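The hidden dictionary can be watched while attributes come and go. A minimal Python 3 sketch, reusing a cut-down ClassTest with only a constructor; the variable names `before`, `during` and `after` are mine.

```python
class ClassTest:
    def __init__(self, n):
        self.i = n

x = ClassTest(15)

before = dict(x.__dict__)    # snapshot: only 'i' is present
x.j = "added later"          # attributes can be added at any time
during = dict(x.__dict__)    # now 'i' and 'j'
del x.i                      # and removed again
after = dict(x.__dict__)     # only 'j' is left

has_i = hasattr(x, "i")
print(before, during, after, has_i)
```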

Data Hiding:

In most cases we don’t want an attribute to be accessed from outside the class. This is where data hiding comes in. This example is the same as the first one, but we won’t be able to create an x object and read x.i. To hide a variable, we just have to add ‘__’ before its name:

———-
x = ClassTest2(10)
print x.printCT()
>>> 10
print x.i
>>> AttributeError: ClassTest2 instance has no attribute 'i'
———-

However, this attribute is not totally blocked from outside access. This is just an aid for object-oriented programming. Remember I said every object has a hidden __dict__ attribute? Let’s exploit that:

———-
x = ClassTest2(10)
print x.__dict__
>>> {'_ClassTest2__i': 10}
———-

Voilà. We now know how to access that variable:

———-
x = ClassTest2(10)
print x._ClassTest2__i
>>> 10
———-

This is just for learning purposes. The fact that you can still access that variable from outside doesn’t mean you have to or should; the whole point of hiding it is to avoid outside access.
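A class with a hidden attribute, as described in this section, might look like the sketch below. It is a Python 3 reconstruction under the assumption that the class is called ClassTest2 and stores its value in `__i`; with that class name, the mangled attribute name is `_ClassTest2__i`.

```python
class ClassTest2:
    def __init__(self, n):
        # Two leading underscores trigger name mangling.
        self.__i = n

    def printCT(self):
        # Inside the class the short name still works.
        return self.__i

x = ClassTest2(10)
value = x.printCT()

try:
    x.i
    direct_access = True
except AttributeError:
    # There is no attribute called plain 'i'.
    direct_access = False

stored_names = list(x.__dict__)     # shows the mangled name
mangled_value = x._ClassTest2__i    # still reachable, but discouraged
print(value, direct_access, stored_names, mangled_value)
```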

Immutable classes:

Having a __dict__ attribute to enable mutability comes at the cost of less efficient code (you can’t have it all). Sometimes we need a class that is more static and saves resources, since we’re going to have a lot of instances of that class and every byte counts.

To do this, we have the __slots__ attribute.

Now this is a more complex and useful (sort of…) class.

– Note that I’ve changed the class to ‘ClassTest(object)’. In Python 2, inheriting from object makes it a new-style class, and that is required for __slots__ to work: if the object class is not included, __slots__ does nothing. I’d recommend it for any class you plan to use in a real program.

– I’ve now defined setter and getter functions for the only attribute of the class, since it is private. Unless we declare these functions, there is ‘no way’ to access or edit that value.

– Slots can also be used to define non-hidden values: we could write ‘__slots__ = ["__i", "j"]’ and one of the attributes would be private and the other wouldn’t.

– This class cannot get more variables, either through “x.j = n” or “self.j = n”; it is limited to the variables defined in the __slots__ statement.

– Using __slots__ removes the __dict__ structure, since it’s no longer needed.

It seems that __slots__ causes problems when using inheritance between objects. I’ve read that it should be used only as a way to save time/memory in very specific programs, not as a way to make a class unable to mutate, so my advice is to use it or not depending on personal preference (a lot of information about this can be found with a single Google query, if you really need to know this in detail) as long as it doesn’t throw errors.
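The points above can be put together in a short sketch. This is a Python 3 reconstruction of what such a class could look like; the class name `SlotTest` and the method names `getI` and `setI` are assumptions of mine, since the original listing isn’t shown here.

```python
class SlotTest(object):
    # Only this attribute may exist on instances.
    __slots__ = ["__i"]

    def __init__(self, n):
        self.__i = n

    def getI(self):
        # Getter for the private attribute.
        return self.__i

    def setI(self, n):
        # Setter for the private attribute.
        self.__i = n

x = SlotTest(10)
x.setI(20)

try:
    x.j = 5
    can_add = True
except AttributeError:
    # Attributes outside __slots__ are rejected.
    can_add = False

has_dict = hasattr(x, "__dict__")
print(x.getI(), can_add, has_dict)
```

Trying to add `j` fails because it isn’t listed in `__slots__`, and the instance no longer carries a `__dict__` at all.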

This is everything regarding basic class implementation for now. I’m currently uploading the .py files to MediaFire and I expect to have them linked here shortly.

Issue 3 files


Programming: Issue #2 [Sockets][Time][Python]: Basic Sockets 2

Today I’m going to explain a socket program that is somewhat more useful than the first one, since a client that does not receive any data back from the server has very limited uses (it can still be used as a remote controller, but you can’t even tell whether the server has correctly received the commands/data sent to it).

For this one I’m going to use the time library. The client will be able to ask the server to send back its local time. This will require the use of a bidirectional socket, as well as a function to calculate and format the time data.

Let’s begin with the server code:

I assume you’re reading this after having seen Issue #1, so the new part starts after the ‘res = add.recv(255)’ line. Here I’m using basic control flow, the ‘if’ statement, to check whether the client has sent a string that is exactly “time”. If this is the case, the return message will be the server’s local time, which we get from another Python file. Otherwise, we tell the client the command was incorrect.

Why use another python file for the time function?

This is basically to keep the server code clean and to be able to reuse that function without needing to import the server file itself. Having another file for the functions will also let us keep the many other functions the server will handle separate from the basic iteration code. Note the ‘import’ at the beginning of the file, which refers to the file that contains the function, server_functions.py (the import statement doesn’t need the .py extension).

After the command has been processed comes another flow-control statement. It contains a call to the sleep function from the time library, which pauses the process for the number of seconds given as a parameter. I do this just to give the client time to reach its recv() call before the reply is sent. Strictly speaking, TCP buffers incoming data until the client reads it, so the delay is a safety margin rather than a requirement, especially when using the loopback address on the same machine, where we cannot tell which process will run faster. The 3 seconds in this example are more than enough; half a second would probably do.

The last statement before closing the socket sends the variable msg back to the client. Note we’re using the ‘add’ variable, which holds the connection with the client, to send it a response.
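The server logic described above can be sketched as follows. This is a Python 3 approximation, not the original server file: the names `handle_client`, `serve_once` and `local_time`, and the text of the error message, are assumptions of mine, the time function is inlined instead of being imported from server_functions.py, and the sleep() delay is left out.

```python
import socket
import time

def local_time():
    # Stand-in for the function kept in server_functions.py.
    return time.strftime("%H:%M:%S")

def handle_client(conn):
    # Read the command and decide what to answer.
    res = conn.recv(255).decode("utf-8")
    if res == "time":
        msg = local_time()
    else:
        msg = "Incorrect command"
    # Reply over the same connection, then close it.
    conn.send(msg.encode("utf-8"))
    conn.close()

def serve_once(host="127.0.0.1", port=8000):
    # Accept a single client, as in Issue #1, and answer it.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(1)
    conn, address = server.accept()
    handle_client(conn)
    server.close()
```

Calling serve_once() blocks until a client connects; a client that sends “time” and then calls recv() gets the formatted time back, and any other command gets the error message.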

The client file is pretty much the same as in Issue #1. It just adds another line to receive data from the server (as we did on the server side) and print its content before closing the socket:

And here’s the code for the time function:

We could just send back the ‘hour’ list; the ‘for’ iteration is just there to make sure it always returns the time in HH:MM:SS format. If we don’t format it this way (or in some equivalent way) it could return HH:M:SS or H:MM:S when the hour/minute/second values are less than 10.
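One way to write such a function is sketched below in Python 3. The function name `get_time` and its optional `now` parameter are assumptions of mine (the parameter is only there so a fixed time can be passed in for checking); the ‘hour’ list and the ‘for’ loop follow the description above.

```python
import time

def get_time(now=None):
    """Return the local time as an HH:MM:SS string."""
    if now is None:
        now = time.localtime()
    hour = [now.tm_hour, now.tm_min, now.tm_sec]
    parts = []
    for value in hour:
        # Zero-pad each field to two digits.
        parts.append("%02d" % value)
    return ":".join(parts)

# A fixed time with single-digit fields: 09:05:03.
sample = time.struct_time((2020, 1, 1, 9, 5, 3, 2, 1, 0))
formatted = get_time(sample)
print(formatted)
```

Without the zero padding, the same sample would come out as 9:5:3.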

After this tutorial, anyone who has researched sockets a little, or who really understands what they are reading here and knows how sockets work, could ask: “And how does the client receive data back if we haven’t set a port number or IP address for it?” And that’s a good question. The server socket needs to know which port to listen on, but the client does not need a specific port number of its own to send the data; any one works fine. So it basically takes a random port assigned by the operating system. If we don’t want it to be random, we can add a bind() call to the client socket after creating it, as we do with the server one.

This concludes the second issue of the programming blog.
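The port the operating system picks can be inspected with getsockname(). In the Python 3 sketch below, binding to port 0 explicitly asks the system to choose a free port, which is the same kind of assignment that happens behind the scenes when a client connects without calling bind(); the variable name `chosen_port` is mine.

```python
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Port 0 means "let the operating system pick a free port".
client.bind(("127.0.0.1", 0))

# getsockname() returns the (IP, port) pair the socket ended up with.
chosen_port = client.getsockname()[1]
print(chosen_port)

client.close()
```

Replacing the 0 with a specific number above 1023 would pin the client to that port instead.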

Issue 2 files


Programming: Issue #1 [Sockets][Python]: Basic Sockets

I will start the programming section with a small post about socket programming on python. This very first post intends to explain the basics of sockets by showing a very simple program.

Sockets are low-level communication structures used to transfer data between programs, managed by the operating system. Each socket has an assigned IP address (we’ll be using IPv4 addresses, though IPv6 can be used as well) and a port number. Port numbers range from 0 to 65535; the first 1024 (0 to 1023) are reserved for the system and we can’t use them unless logged in as superuser, so we’ll pick ours from 1024 to 65535.

In order to transfer data from one application to another, we need an active port on both programs. We’ll be using a server-client model for this tutorial. Basically the idea is to have a server that accepts connections from clients. The clients then send some command to the server, and the server does whatever is needed with the received data.

Starting with a simple server application, it should look like this:


----------
#server.py
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 8000))
server.listen(1)

add, port = server.accept()
print add.recv(255)

server.close()
----------

Like most programs under 10 lines, it does nothing amazing. Here is a more in-depth explanation:

– The first line imports Python’s socket library, required to use the socket interface.

– The socket() function creates a socket. In this case we’re creating an IPv4 socket (socket.AF_INET) that uses the TCP protocol (socket.SOCK_STREAM). The function can also be called without parameters; the defaults are these same two values, so it would still create a TCP socket.

– The bind() function links the socket to an address. In this case, we’ll be using the loopback address 127.0.0.1 for the IP address and the port number 8000, which is just an arbitrary number above 1024; any other will work fine as well.

– listen() puts the socket in the LISTENING state. The number passed as a parameter is the backlog: the maximum number of connections that can wait in the queue before being accepted. In this case, 1 will work because our code just receives a string and prints it on screen.

– accept() waits for a connection attempt on that port and initializes the add and port variables. Despite their names, add is a new socket object representing the connection with the client, and port holds the client’s address as an (IP, port) pair. We use add to receive data; we won’t be using port in this example, but we’ll see how to use it in future tutorials. Until a connection arrives, accept() blocks program execution (meaning this program will never end until it receives a connection).

– add.recv() reads incoming data once a connection is established. The number indicates the maximum number of bytes we want to read in each recv() call. As with accept(), this function blocks program execution.

– close() closes the current socket, freeing its resources. We need to do this every time we want to end a connection, though in this case the socket would be closed anyway at the end of execution, as the process ends.

Now let’s take a look at the client code:


----------
#client.py
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", 8000))
msg = raw_input()
client.send(msg)

client.close()
----------

Skipping the lines already explained on the server code:

– connect() establishes a connection between the socket we created and another one at a remote location. We have to give it the destination IP address and port. For this to work, the receiving socket must be in the LISTENING state.
– send() sends a stream of characters to the destination socket over the established connection.
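To try both halves in a single file, the server can be run in a background thread while the client runs in the main one. This is a Python 3 sketch of the same flow, not the original server.py and client.py: the message is hard-coded instead of read with raw_input(), strings are encoded to bytes before sending, and port 0 is used so the operating system picks a free port (the post uses 8000). The names `run_server` and `received` are mine.

```python
import socket
import threading

def run_server(server, received):
    # Wait for one client and store what it sends.
    conn, address = server.accept()
    received.append(conn.recv(255).decode("utf-8"))
    conn.close()

# Server side: bind and listen before the client tries to connect.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

received = []
worker = threading.Thread(target=run_server, args=(server, received))
worker.start()

# Client side: connect, send one message, close.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.send("hello server".encode("utf-8"))
client.close()

# Wait for the server thread to finish, then clean up.
worker.join()
server.close()
print(received[0])
```

Because listen() runs before the thread starts, the client’s connect() always finds the socket in the LISTENING state.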

This concludes the first tutorial on python programming. On next issues I’ll try to give out examples of Python code usage itself, not just socket-oriented concepts. The code on this tutorial works well, but I’ll try to upload the files once I figure out a simple way to upload .py files.

Issue 1 files
