handyfloss

Because FLOSS is handy, isn’t it?

Switch to self-hosted blog

Posted by isilanes on February 25, 2008

This blog moves to: handyfloss.net.

This weekend I signed up for a hosting service with DreamHost. By the way, I used a promo code from dh.promo-codes.us.

Along with the hosting, I got a free domain, which allows me a shorter blog name: handyfloss.net.

I am in the process of moving my stuff there. Moving the posts and comments was easy. Statistics are probably not transferable (maybe it wouldn’t make sense anyway), and other things like widgets and the blogroll apparently have to be moved by hand. Oh, and obviously I have to work on the theme (the most important thing on any website :^)

I will not write here anymore (I’ll leave it behind, as I left isilanes.blogspot.com), so you will be able to follow my new posts at handyfloss.net. Expect theme/widget/whatever changes there for a while, until I settle down a bit. It was fun to be here at WordPress.com, and I hope the fun goes on with the WordPress.org software (the one that powers my handyfloss.net blog) on DreamHost (or anywhere else).

See you there!

Posted in my ego and me | 10 Comments

Going round and round with Euskaltel’s bandwidth increase

Posted by isilanes on February 21, 2008

As the reader may know, Euskaltel has doubled (and even tripled) the bandwidth of all, or almost all, of its Internet connection plans. And it has done so while keeping prices unchanged, which is appreciated (although not entirely surprising, given that they had not changed their offer in over 2 years).

Well, the 300 kb line my parents have contracted is supposedly now offered at 1 Mb. Great, three times the speed for the same price! Well, the reality is that this is not entirely true. Apparently, changing the web page to offer higher speeds is easier than actually delivering higher speeds, so there is a small discrepancy between what is offered and what is delivered: my parents are still at 300 kb.

I decided to wait until February, to give them a “grace” period to bring the service in line with the offer, but since we are already at the end of the month, I have decided to complain through their customer area.

Since I admire and respect Euskaltel for its good customer care and efficient service, I wanted to pay tribute to them by publishing on the blog the electronic conversation I am having with them. This way, my readers will see how good Euskaltel is (or how bad: it is in their hands). My experience is a drop in the ocean, but if a single reader decides to sign up with Euskaltel after reading this, I will feel I have done something for a company that goes out of its way to give me the best possible service.

My original complaint (18-02-2008):

I see on your page (euskaltel.es) that the Despega 300 contract has become Despega 1M, at the same rate. My parents have that service contracted, but the connection speed is still 300 kb. I would like to know what kind of error you have made: either misleading advertising (if the error is on the web page) or deficient service (if you are giving us less bandwidth than contracted). Of course, I would also like the error to be corrected as soon as possible.

Thanks in advance.

Euskaltel’s reply (20-02-2008):

Dear customer:

In reply to the query you sent us by e-mail: indeed, the Despega 300 Kbps Internet service is no longer sold, and has become Despega 1 Mb for the same monthly fee.

For customers who have Despega 300 Kbps contracted, the speed will be raised to 1 Mb at no additional cost. These speed increases are being carried out gradually, and all our customers are expected to have the updated speeds by summer. In any case, when the speed increase is about to take place, you will hear from us informing you of the change.

Hoping to have answered your questions,

Kind regards,

Euskaltel, S.A.

My reply (21-02-2008):

Dear Euskaltel,

I understand and respect Euskaltel’s reasons (even though they have not been explained to me) for raising speeds gradually for existing customers (even though this is unfair treatment compared with new customers, who get the increase immediately).

Since I understand that Euskaltel is as understanding as I am, or more so, I assume you will not mind if, in return, I pay one third of my usual monthly fee, since I am being given one third of the contracted speed (I get the bandwidth from the time I signed, but not the CURRENT bandwidth of the service I contracted). Of course, just as Euskaltel does with me, I will “gradually” increase my monthly payment to Euskaltel, and I expect (barring unforeseen events) to be paying 100% of my fee “by summer”, when you will presumably be giving me 100% of the contracted service.

Iñaki

P.S.: you can follow this conversation, like all my readers, on my blog: http://handyfloss.wordpress.com/2008/02/21/a-vueltas-con-el-incremento-de-ancho-de-banda-de-euskaltel/

Euskaltel’s reply (22-02-2008):

Dear customer:

In reply to the query you sent us in your message, we inform you that when Euskaltel announced the speed increase it would apply to services already contracted by customers, without changing the fees, it also announced that the change would be applied in stages over the following months. We also inform you that, for this type of change, the law stipulates a period of 6 months.

Likewise, Euskaltel also announced that from that moment on the minimum speed it would offer would be 1M.

For customers who have the Despega 300kb service contracted, Euskaltel will increase the connection speed without increasing the monthly fee, which does not harm the customer at any time. Nor will it reduce the fee to one third, since Euskaltel never announced that it would change the monthly fee for contracted broadband services while keeping the speed that will shortly become obsolete, but rather that it would increase the speed while keeping the monthly fee.

We also remind you that customers are being offered the bandwidth they contracted, as you rightly say, until (“asta” in the original, sic) Euskaltel applies the speed increase when the time comes; and they will be notified of the change.

Kind regards,

Euskaltel, S.A.

My reply (22-02-2008):

Dear Euskaltel,

At no point have I doubted that you had the law on your side. What is more, I was completely sure that if the law allowed you to delay the promised speed increase by 6 months, you would take the 6 months, as you admit you will. Had it allowed you 12 months, you would obviously have taken 12. All in order to give your customers the best possible service, of course!

I am reluctant to take my speed increase “keeping the fee” (as you repeat so often) as a gift that Euskaltel grants me in its infinite kindness. Rather, I take it as a legal obligation not to discriminate between customers, since (for business reasons) you have updated your obsolete rates (frozen for more than 2 years) for new customers, and (much as it pains you) you cannot have separate pricing for new and old customers. Therefore, you are obliged to increase my bandwidth, and you are going to do it as late as the law allows. So excuse me if I do not thank you.

The only doubt I have left is the moral justification (since there seems to be a legal one) for offering a better service to new customers, with the resulting unfair treatment of existing ones. It seems that instead of rewarding loyalty you prefer to insult it.

Given your policy, the wisest thing for me to do would be to cancel my contract and immediately sign up again, so as to benefit from your current rate. Of course, I have no doubt that you have countless legal safeguards to obstruct such an operation as much as possible, delaying the cancellation for as long as the law allows, so that doing this would not be to my advantage.

Reading my arguments, does it seem to you that you are working to keep your customers happy?

My humble advice, for next time, is that if you are going to change rates or services, you do it for ALL customers simultaneously (if you cannot, wait until you can), and announce the change 1 minute AFTER making it. Believe me, nobody will sue you for doubling the bandwidth without warning. Warning without doubling, on the other hand, could well constitute an offence (or at least a serious fault in the eyes of your customers).

Sincerely,

Iñaki

Posted in This evil world | 6 Comments

Some more tweaks to my Python script

Posted by isilanes on February 19, 2008

All the comments on my previous post have given me hints to further increase the efficiency of a script I am working on. Here I present the advice I have followed, and the speed gain each piece provided. I will speak of “speedup”, instead of timing, because this second set of tests was made on a different computer. The “base” speed will be the last value of my previous test set (1.5 s on that computer, 1.66 s on this one). A speedup of “2” will thus mean half the execution time (0.83 s on this computer).
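The speedup convention is just a ratio of running times. As a minimal sketch (the function name `speedup` is my own, and the numbers are the ones quoted above):

```python
def speedup(base_time, new_time):
    """Return how many times faster new_time is than base_time."""
    return base_time / new_time

# Base time on this computer is 1.66 s; halving it to 0.83 s
# corresponds to a speedup of 2.
print(speedup(1.66, 0.83))
```

A value below 1 would mean the change made the script slower.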

Version 6: Andrew Dalke suggested the substitution of:

line = re.sub('>','<',line)

with:

line = line.replace('>','<')

Avoiding the re module seems to speed things up when we are searching for fixed strings, since the additional features of the re module are not needed.

This is true, and I got a speedup of 1.37.
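For anyone who wants to check this on their own machine, here is a rough sketch in Python 3 using the standard `timeit` module. The sample line and the function names are assumptions made for illustration; they are not taken from the original script, and the timings will differ from one computer to another.

```python
import re
import timeit

# Hypothetical sample line, shaped like the XML in a BOINC host file.
line = "    <total_credit>1234.5</total_credit>\n"

def with_re(text):
    """Swap '>' for '<' through the re module."""
    return re.sub('>', '<', text)

def with_replace(text):
    """Swap '>' for '<' with the plain string method."""
    return text.replace('>', '<')

if __name__ == "__main__":
    # Both give the same string; only the cost differs.
    assert with_re(line) == with_replace(line)
    for func in (with_re, with_replace):
        seconds = timeit.timeit(lambda: func(line), number=100000)
        print(func.__name__, round(seconds, 3))
```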

Version 7: Andrew Dalke also suggested substituting:

search_cre = re.compile(r'total_credit').search
if search_cre(line):

with:

if 'total_credit' in line:

This is more readable, more concise, and apparently faster. Doing it increases the speedup to 1.50.

Version 8: Andrew Dalke also proposed flattening some variables, and specifically avoiding dictionary search inside loops. I went further than his advice, even, and substituted:

stat['win'] = [0,0]

loop
  stat['win'][0] = something
  stat['win'][1] = somethingelse

with:

win_stat_0 = 0
win_stat_1 = 0

loop
  win_stat_0 = something
  win_stat_1 = somethingelse

This pushed the speedup further up, to 1.54.
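A runnable Python 3 sketch of the same idea follows. It is only an illustration: the function names and the list of credits are made up, and it accumulates totals (as the full script does) rather than assigning placeholders.

```python
def tally_with_dict(credits):
    """Update a nested dictionary entry on every iteration."""
    stat = {'win': [0, 0]}
    for credit in credits:
        stat['win'][0] += 1        # dictionary and list lookup each time
        stat['win'][1] += credit
    return stat['win']

def tally_flat(credits):
    """Same bookkeeping with plain local variables inside the loop."""
    win_count = 0
    win_credit = 0
    for credit in credits:
        win_count += 1
        win_credit += credit
    return [win_count, win_credit]

print(tally_with_dict([1.0, 2.5]))
print(tally_flat([1.0, 2.5]))
```

The flat version can copy its totals back into the dictionary once, after the loop, if the rest of the script still expects `stat`.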

Version 9: Justin proposed reducing the number of times some patterns were matched, and extracting some info more directly. I attained that by substituting:

loop:
  if 'total_credit' in line:
    line   = line.replace('>','<')
    aline  = line.split('<')
    credit = float(aline[2])

with:

pattern    = r'total_credit>([^<]+)<'
search_cre = re.compile(pattern).search

loop:
  if 'total_credit' in line:
    cre    = search_cre(line)
    credit = float(cre.group(1))

This trick saved enough to increase the speedup to 1.62.
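To see that both ways of extracting the credit agree, here is a small self-contained Python 3 sketch. The sample line is an assumption about how a total_credit line looks, and the two function names are mine.

```python
import re

# Hypothetical sample line from the intermediate file.
line = "    <total_credit>1234.5</total_credit>\n"

def credit_by_split(text):
    """Turn every '>' into '<', split on '<', take the third field."""
    fields = text.replace('>', '<').split('<')
    return float(fields[2])

# Compile once, outside any loop, and keep the bound search method.
search_cre = re.compile(r'total_credit>([^<]+)<').search

def credit_by_regex(text):
    """Capture the text between the opening and closing tags directly."""
    return float(search_cre(text).group(1))

print(credit_by_split(line), credit_by_regex(line))
```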

Version 10: The next tweak was an idea of mine. I was digesting a huge log file with zcat and grep, to produce a smaller intermediate file, which Python would process. The structure of this intermediate file is of alternating lines with “total_credit” then “os_name” then “total_credit”, and so on. When processing this file with Python, I was searching the line for “total_credit” to differentiate between these two lines, like this:

for line in f:
  if 'total_credit' in line:
    do something
  else:
    do somethingelse

But the alternating structure of my input would allow me to do:

odd = True
for line in f:
  if odd:
    do something
    odd = False
  else:
    do somethingelse
    odd = True

Presumably, checking falsity of a boolean is faster than matching a pattern, although in this case the gain was not huge: the speedup went up to 1.63.
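The toggle can be tried on a few made-up lines. This Python 3 sketch assumes, as described above, that the input strictly alternates between a total_credit line and an os_name line; the sample data and the function name `pair_up` are hypothetical.

```python
# Hypothetical sample of the intermediate file, already in memory.
lines = [
    "<total_credit>10.0</total_credit>",
    "<os_name>Linux</os_name>",
    "<total_credit>5.5</total_credit>",
    "<os_name>Microsoft Windows XP</os_name>",
]

def pair_up(lines):
    """Pair each credit line with the os_name line that follows it."""
    pairs = []
    odd = True
    credit_line = None
    for line in lines:
        if odd:
            credit_line = line                 # 1st, 3rd, 5th... line
        else:
            pairs.append((credit_line, line))  # 2nd, 4th, 6th... line
        odd = not odd                          # flip without inspecting the text
    return pairs

for credit_line, os_line in pair_up(lines):
    print(credit_line, os_line)
```

If the input ever loses the strict alternation (a missing line, for instance), every later pair will be wrong, which is the price of not checking the content.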

Version 11: Another clever suggestion by Andrew Dalke was to avoid using the intermediate file, and use os.popen to connect to and read from the zcat/grep command directly. Thus, I substituted:

os.system('zcat host.gz | grep -F -e total_credit -e os_name > '+tmp)

f = open(tmp)
for line in f:
  do something

with:

f = os.popen('zcat host.gz | grep -F -e total_credit -e os_name')

for line in f:
  do something

This saves disk I/O time, and the performance is increased accordingly. The speedup goes up to 1.98.
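In current Python the `subprocess` module is the usual way to read from a command. Below is a sketch of the same idea in Python 3; the helper name `lines_from_command` is mine, and the demo uses a stand-in child process (the Python interpreter itself) so that it runs without zcat, grep or a host.gz file.

```python
import subprocess
import sys

def lines_from_command(command, shell=False):
    """Yield the output lines of a child process, one at a time."""
    proc = subprocess.Popen(command, shell=shell,
                            stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        proc.wait()

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere.
    demo = [sys.executable, "-c", "print('total_credit'); print('os_name')"]
    for line in lines_from_command(demo):
        print(line, end="")
```

With the real data, the call would presumably be `lines_from_command('zcat host.gz | grep -F -e total_credit -e os_name', shell=True)`, since the pipe needs a shell to interpret it.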

All the values I have given are for a sample log (from MalariaControl.net) with 7 MB of gzipped info (49 MB uncompressed). I also tested my scripts with a 267 MB gzipped (1.8 GB uncompressed) log (from SETI@home), and a plot of speedups vs. versions follows:

Figure: Execution speedup vs. version

Notice how the last modification (avoiding the temporary file) is of much more importance for the bigger file than for the smaller one. Recall also that the odd/even modification (version 10) is of very little importance for the small file, but quite efficient for the big file (compare it with Version 9).

The plot doesn’t show it (it compares versions with the same input, not one input with the other), but my eleventh version of the script runs the 267 MB log faster than Version 1 ran the 7 MB one! For the 7 MB input, the overall speedup from Version 1 to Version 11 is above 50.

Posted in howto | 7 Comments

Summary of my Python optimization adventures

Posted by isilanes on February 17, 2008

Blog moved to: handyfloss.net

Entry available at: http://handyfloss.net/2008.02/summary-of-my-python-optimization-adventures/

This is a follow up to two previous posts. In the first one I spoke about saving memory by reading line-by-line, instead of all-at-once, and in the second one I recommended using Unix commands.

The script reads a host.gz log file from a given BOINC project (more precisely one I got from MalariaControl.net, because it is a small project, so its logs are also smaller), and extracts how many computers are running the project, and how much credit they are getting. The statistics are separated by operating system (Windows, Linux, MacOS and other).

Version 0

Here I read the whole file to RAM, then process it with Python alone. Running time: 34.1s.

#!/usr/bin/python

import os
import re
import gzip

credit  = 0
os_list = ['win','lin','dar','oth']

stat = {}
for osy in os_list:
  stat[osy] = [0,0]
  
# Process file:
f = gzip.open('host.gz','r')
for line in f.readlines():
  if re.search('total_credit',line):
    # Strip the <total_credit> and </total_credit> tags, leaving only
    # the number between them
    credit = float(re.sub('</?total_credit>','',line.split()[0]))
  elif re.search('os_name',line):
    if re.search('Windows',line):
      stat['win'][0] += 1
      stat['win'][1] += credit
    elif re.search('Linux',line):
      stat['lin'][0] += 1
      stat['lin'][1] += credit
    elif re.search('Darwin',line):
      stat['dar'][0] += 1
      stat['dar'][1] += credit
    else:
      stat['oth'][0] += 1
      stat['oth'][1] += credit
f.close()

# Return output:
nstring = ''
cstring = ''
for osy in os_list:
  nstring +=   "%15.0f " % (stat[osy][0])
  try:
    cstring += "%15.0f " % (stat[osy][1])
  except:
    print osy,stat[osy]

print nstring
print cstring

Version 1

The only difference is a “for line in f:”, instead of “for line in f.readlines():”. This saves a LOT of memory, but is slower. Running time: 44.3s.

Version 2

In this version, I use precompiled regular expressions, and the time saving is noticeable. Running time: 26.2s.

#!/usr/bin/python

import os
import re
import gzip

credit  = 0
os_list = ['win','lin','dar','oth']

stat = {}
for osy in os_list:
  stat[osy] = [0,0]


pattern    = r'total_credit'
match_cre  = re.compile(pattern).match
pattern    = r'os_name'
match_os   = re.compile(pattern).match
pattern    = r'Windows'
search_win = re.compile(pattern).search
pattern    = r'Linux'
search_lin = re.compile(pattern).search
pattern    = r'Darwin'
search_dar = re.compile(pattern).search

# Process file:
f = gzip.open('host.gz','r')

for line in f:
  if match_cre(line,5):
    # Strip the <total_credit> and </total_credit> tags, leaving only
    # the number between them
    credit = float(re.sub('</?total_credit>','',line.split()[0]))
  elif match_os(line,5):
    if search_win(line):
      stat['win'][0] += 1
      stat['win'][1] += credit
    elif search_lin(line):
      stat['lin'][0] += 1
      stat['lin'][1] += credit
    elif search_dar(line):
      stat['dar'][0] += 1
      stat['dar'][1] += credit
    else:
      stat['oth'][0] += 1
      stat['oth'][1] += credit
f.close()

# etc.

Version 3

Later I decided to use AWK to perform the heaviest part: parsing the big file, to produce a second, smaller, file that Python will read. Running time: 14.8s.

#!/usr/bin/python

import os
import re

credit  = 0
os_list = ['win','lin','dar','oth']

stat = {}
for osy in os_list:
  stat[osy] = [0,0]
  
pattern    = r'Windows'
search_win = re.compile(pattern).search
pattern    = r'Linux'
search_lin = re.compile(pattern).search
pattern    = r'Darwin'
search_dar = re.compile(pattern).search

# Distill file with AWK:
tmp = 'bhs.tmp'
os.system('zcat host.gz | awk \'/total_credit/{printf $0}/os_name/{print}\' > '+tmp)

# Process tmp file:
f = open(tmp)
for line in f:
  line = re.sub('>','<',line)
  aline = line.split('<')
  credit = float(aline[2])
  os_str = aline[6]
  if search_win(os_str):
    stat['win'][0] += 1
    stat['win'][1] += credit
  elif search_lin(os_str):
    stat['lin'][0] += 1
    stat['lin'][1] += credit
  elif search_dar(os_str):
    stat['dar'][0] += 1
    stat['dar'][1] += credit
  else:
    stat['oth'][0] += 1
    stat['oth'][1] += credit
f.close()

# etc

Version 4

Instead of using AWK, I decided to use grep, with the idea that nothing can beat this tool, when it comes to pattern matching. I was not disappointed. Running time: 5.4s.

#!/usr/bin/python

import os
import re

credit  = 0
os_list = ['win','lin','dar','oth']

stat = {}
for osy in os_list:
  stat[osy] = [0,0]
  
pattern    = r'total_credit'
search_cre = re.compile(pattern).search

pattern    = r'Windows'
search_win = re.compile(pattern).search
pattern    = r'Linux'
search_lin = re.compile(pattern).search
pattern    = r'Darwin'
search_dar = re.compile(pattern).search

# Distill file with grep:
tmp = 'bhs.tmp'
os.system('zcat host.gz | grep -e total_credit -e os_name > '+tmp)

# Process tmp file:
f = open(tmp)
for line in f:
  if search_cre(line):
    line = re.sub('>','<',line)
    aline = line.split('<')
    credit = float(aline[2])
  else:
    if search_win(line):
      stat['win'][0] += 1
      stat['win'][1] += credit
    elif search_lin(line):
      stat['lin'][0] += 1
      stat['lin'][1] += credit
    elif search_dar(line):
      stat['dar'][0] += 1
      stat['dar'][1] += credit
    else:
      stat['oth'][0] += 1
      stat['oth'][1] += credit

f.close()

# etc

Version 5

I was not completely happy yet. I discovered the -F flag for grep (in the man page), and decided to use it. This flag tells grep that the pattern we are using is a literal, so no expansion of it has to be made. Using the -F flag I further reduced the running time to: 1.5s.

Figure: Running time vs. script version
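The difference between a pattern and a literal, which is what the -F flag of Version 5 is about, can be illustrated from Python too. This is only an analogy written in Python 3, not what grep does internally; the function names and the sample line are assumptions.

```python
import re

def regex_match(pattern, text):
    """Treat pattern as a regular expression (grep's default behaviour)."""
    return re.search(pattern, text) is not None

def literal_match(pattern, text):
    """Treat pattern as a fixed string (what grep -F does)."""
    return pattern in text

line = "    <os_name>Linux</os_name>"   # hypothetical sample line

print(regex_match("os.name", line))     # the dot matches the underscore
print(literal_match("os.name", line))   # no literal "os.name" in the line
print(literal_match("os_name", line))
```

Since total_credit and os_name contain no special characters, both interpretations select the same lines here, and the literal one simply has less work to do.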

Posted in howto | 13 Comments

A brownie point for Arsys

Posted by isilanes on February 17, 2008

Let me say up front that I know nothing about Arsys, and that (for now) I have nothing to do with them. I simply wanted to share the fact that I visited their page (fantasizing about getting my own domain), and saw this:

arsys_ff.png

Nothing odd? Well, notice that, as befits any Internet-related service, it has a little photo of a man with a web browser open… Internet Explorer? I don’t think so…

Posted in Free software and related beasts | 2 Comments

Speeding up file processing with Unix commands

Posted by isilanes on February 17, 2008

Blog moved to: handyfloss.net

Entry available at: http://handyfloss.net/2008.02/speeding-up-file-processing-with-unix-commands/

In my last post I commented on some changes I made to a Python script that processes a file, to reduce the memory overhead of reading the whole file into RAM.

I realized that the script needed much optimizing. After reading the link a reader (Paddy3118) was kind enough to point me to, I saw that I could save time by compiling my search expressions. Basically my script opens a gzipped file, searches for lines containing some keywords, and uses the info read from those lines. The original script would take 44 seconds to process a 6.9 MB file (49 MB uncompressed). Using compile on the search expressions, this time went down to 29 s. I tried using match instead of search, and expressions like “if pattern in line:”, instead of re.search(), but these didn’t make much of a difference.

Later I thought that Unix commands such as grep were especially suited to the task, so I gave them a try. I modified my script to run in two steps: in the first one I used zcat and awk (called from within the script) to create a much smaller temporary file with only the lines containing the information I wanted. In the second step, I processed this file with standard Python code. This hybrid approach reduced the processing time to just 12 s. Sometimes using the best tool really makes a difference, and it seems that the Unix utilities are hard to match in terms of performance.

It is only after programming exercises like this one that one realizes how important writing good code is (something I will probably never do, but I try). For some reason I always think of Windows, and how Microsoft refuses to make an efficient program, relying on improvements in the hardware instead. It’s as if I tried to speed up my first script by using a faster computer, instead of fixing the code to be more efficient.

Posted in howto | 1 Comment

Python: speed vs. memory tradeoff reading files

Posted by isilanes on February 15, 2008

I was making a script to process some log file, and I basically wanted to go line by line, and act upon each line if some condition was met. For the task of reading files, I generally use readlines(), so my first try was:

f = open(filename,'r')
for line in f.readlines():
  if condition:
    do something
f.close()

However, I realized that as the size of the file read increased, the memory footprint of my script increased too, to the point of almost halting my computer when the size of the file was comparable to the available RAM (1GB).

Of course, Python hackers will frown at me, and say that I was doing something stupid… Probably so. I decided to try a different thing to reduce the memory usage, and did the following:

f = open(filename,'r')
for line in f:
  if condition:
    do something
f.close()

Both pieces of code look very similar, but pay a bit of attention and you’ll see the difference.

The problem with “f.readlines()” is that it reads the whole file and assigns its lines to the elements of an (anonymous, in this case) list. Then the for loops through the list, which is in memory. This leads to faster execution, because the file is read once and then forgotten, but requires more memory, because a list the size of the file has to be created in RAM.

Fig. 1: Memory vs file size for both methods of reading the file

When you do “for line in f:”, you are effectively reading the lines one by one, one per cycle of the loop. Hence, the memory use is effectively constant, and very low, although the disk is accessed more often, and this usually leads to slower execution of the code.

Fig. 2: Execution time vs file size for both methods of reading the file
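Both ways of reading can be put side by side in a self-contained Python 3 sketch. Everything here is made up for illustration (the helper names, the throwaway log file and its contents); it shows that the two loops give the same answer, while only the first builds the full list of lines in memory.

```python
import os
import tempfile

def count_with_readlines(path, word):
    """Load every line into a list first, then scan the list."""
    with open(path) as f:
        return sum(1 for line in f.readlines() if word in line)

def count_with_iteration(path, word):
    """Scan the file lazily; only the current line is held in memory."""
    with open(path) as f:
        return sum(1 for line in f if word in line)

def make_sample(n_lines):
    """Write a throwaway log file (hypothetical contents), return its path."""
    handle, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(handle, "w") as f:
        for i in range(n_lines):
            f.write("error\n" if i % 10 == 0 else "ok\n")
    return path

if __name__ == "__main__":
    sample = make_sample(1000)
    try:
        print(count_with_readlines(sample, "error"))
        print(count_with_iteration(sample, "error"))
    finally:
        os.remove(sample)
```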

Posted in Free software and related beasts | 2 Comments

Password cracking with John the Ripper

Posted by isilanes on February 10, 2008

Following some security policy updates (not necessarily for the better) at my workplace, a colleague and I discussed the vulnerability of user passwords in the accounts of our computers. He asserted that an attack with a cracker program such as John the Ripper could potentially break into someone’s account, provided the cracker had access to an initial user account.

I am by no means an expert on cryptography and computer security, but I would like to outline some ideas about the subject here, and explain why my colleague was partially wrong.

How authentication works

When we log in to an account in a computer, we enter a password. The computer checks it, and if it is the correct one, we are granted access. For the computer to check the password, we must have told it beforehand what the correct password is. Now, if the computer knows our password, anyone with access to the place where it is stored could retrieve our password.

We can avoid that by not telling the computer our password, but only an encrypted version. The encrypted version can be obtained from the password, but there is no operation to obtain the password from its encrypted form. When the computer asks for a password, it applies the encrypting algorithm, and compares the result with the stored encrypted form. If they are equal, it infers that the password was correct, since only from the correct password could one obtain the encrypted form.

On the other hand, no-one can possibly obtain the original password, even by inspection of the contents of the host computer, because only the encrypted form is available there.
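The check described above can be sketched in Python 3 with the standard `hashlib` module. This is an illustration under my own assumptions, not how a Linux login actually stores passwords: the one-way function used here is PBKDF2 with a random salt, and the function names and iteration count are arbitrary choices for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation

def make_record(password):
    """Return what the computer would store: a salt and a one-way digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(attempt, record):
    """Apply the same one-way function to the attempt and compare results."""
    salt, digest = record
    candidate = hashlib.pbkdf2_hmac('sha256', attempt.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

record = make_record("s3cret")
print(check_password("s3cret", record))
print(check_password("guess", record))
```

The stored record never contains the password itself, only something derived from it that cannot be run backwards.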

How password cracking works

I will only deal with brute force attacks, i.e., trying many passwords, until the correct one is found.

Despite the “romantic” idea that a cracker will try to log in to an account again and again until she gets access, this method is really lame, since such repeated login attempts can be detected and blocked.

The ideal approach is to somehow obtain the encrypted password that the computer stores, and then try (in the cracker’s computer) to obtain the plain password from it. To do so, the cracker will make a guess, encrypt it with the known encrypting method, and compare the result with the encrypted key, repeating the process until a match is found. This task is the one performed by tools such as John the Ripper.
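The guess, encrypt and compare loop is simple enough to show with a toy Python 3 example. This is not how John the Ripper is implemented; `toy_hash` is an unsalted SHA-256 stand-in for whatever function the system really uses, and the candidate list is invented.

```python
import hashlib

def toy_hash(password):
    """Stand-in for the system's one-way function (unsalted SHA-256)."""
    return hashlib.sha256(password.encode()).hexdigest()

def guess_password(stored_hash, candidates):
    """Hash each guess and compare it with the stored value."""
    for guess in candidates:
        if toy_hash(guess) == stored_hash:
            return guess
    return None

stored = toy_hash("letmein")                  # what the cracker got hold of
wordlist = ["123456", "password", "letmein"]  # hypothetical guesses
print(guess_password(stored, wordlist))
```

A password that is not in the list of guesses is simply never found, which is why the length and randomness of the password matter so much.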

Why this shouldn’t work in a safe (Linux) system

The security of a password relies heavily on how difficult it is for the cracker to guess. If our password is the same as our user name, this will be the first guess of the cracker, and she’ll find it immediately. If our password is a word that appears in a dictionary, she’ll find it quickly. If it is a string of 12 unrelated characters, plus digits, dots or exclamation marks, then it will take ages for the cracking program to reach the point where it guesses it.

The second hurdle for the cracker is that, even if she gets access to a regular user account, the file where the encrypted passwords are stored is only readable by the root (administrator) user (in a Linux system). Information about users and their passwords is stored in /etc/passwd (that any user can read) and /etc/shadow (that only root can read). The encrypted password is stored only in the latter. In the past all info was in /etc/passwd, but later on it was split, to increase the security.

In short: you need root access to start trying to crack passwords in a machine… but, if you have root access, why bother? You already have full access to all accounts!

Posted in howto

Filelight makes my day

Posted by isilanes on February 7, 2008

First of all: yes, this could have been made with du. Filelight is just more visual.

The thing is that yesterday I noticed that my root partition was a bit on the crowded side (90+%). I thought it could be because of /var/cache/apt/archives/, where all the installed .deb files reside, and started purging some unneeded installed packages (very few… I only install what I need). However, I decided to double check, and Filelight gave me the clue:

Filelight_root

Some utter disaster in a printing job filled the /var/spool/cups/tmp/ with 1.5GB of crap! After deleting it, my root partition is back to 69% full, which is normal (I partitioned my disk with 3 roots of 7.5GB (for three simultaneous OS installations, if need be), a /home of 55GB, and a secondary disk of 250GB).

Simple problem, simple solution.
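For the record, the non-visual route (what du does) can also be sketched in a few lines of Python 3. The function names are mine, and the demo builds a small throwaway directory tree instead of scanning a real /var, so the sizes printed are those of the made-up files.

```python
import os
import tempfile

def dir_size(path):
    """Add up the sizes of all regular files below path."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total

def biggest_subdirs(path, top=5):
    """Return (size, path) pairs for the largest immediate subdirectories."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        for name, nbytes in (("spool", 1500), ("cache", 200)):
            os.mkdir(os.path.join(top, name))
            with open(os.path.join(top, name, "junk"), "wb") as f:
                f.write(b"x" * nbytes)
        for size, sub in biggest_subdirs(top):
            print(size, os.path.basename(sub))
```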

Posted in my ego and me

App of the week: digiKam

Posted by isilanes on February 6, 2008

As digital cameras get more and more common, and personal photo collections grow bigger, solutions for managing all these images are more and more needed.

I bought my first digital camera (a Nikon CoolPix 2500) almost 4 years ago (now I see the model was 1 year old when I bought my unit), and now I own a Panasonic Lumix DMC FX10 that I’m very happy with. I obviously have the need outlined above, plus the desire to sometimes share some pictures over the web. I didn’t want to go for something like Picasa, so I wrote a lengthy Perl/Tk script to generate HTML albums from some info I would enter.

When I later discovered digiKam, I realized it had all the features I wanted. It is incredibly useful to tag your pictures, so that you can later on retrieve, say, “all the pictures in which my father appears”. It also has many other features, like easy access to image manipulation (of which I only use the rotation for photos requiring it), or ordering of the pictures by date, so you can see how many pictures were taken each month. The humble, but for me killer, feature is that you can automatically generate HTML albums from a list of pictures, which can be selected e.g. by their tags.

Give it a try, and you’ll love it.

Posted in Application of the Week

 