Sunday, August 17, 2008

svn: inconsistent line ending style

Today I was adding the Grails tutorials to an SVN repository. Yes, I know they should have been in the repository a long time ago :). What should have been a simple operation ended with the error "svn: inconsistent line ending style". For those who haven't hit this problem yet: SVN reports this error when a single file mixes different line ending styles, and it refuses to add such a file to the repository until the line endings are fixed. Since a few hundred files had this problem, fixing them by hand was not an option. To my surprise, after some googling I could not find a way to fix all the files automatically, so I decided to write a Groovy script to do it for me.
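To make "different line ending styles in the same file" concrete, here is a minimal sketch that reports which styles occur in a piece of text. The helper names lineEndingStyles and hasMixedLineEndings are made up for this illustration and are not part of the conversion script; the check is only my reading of what "inconsistent" means, not SVN's actual implementation.

```groovy
// Hypothetical helper: collects the line-ending styles found in the text.
Set<String> lineEndingStyles(String text) {
    def styles = new TreeSet<String>()
    // CRLF is listed first so that its CR and LF are not counted separately.
    def matcher = text =~ /\r\n|\r|\n/
    while (matcher.find()) {
        switch (matcher.group()) {
            case '\r\n': styles << 'CRLF'; break
            case '\r':   styles << 'CR';   break
            default:     styles << 'LF'
        }
    }
    return styles
}

// Hypothetical helper: true when more than one style is present.
boolean hasMixedLineEndings(String text) {
    lineEndingStyles(text).size() > 1
}

assert !hasMixedLineEndings("one\ntwo\n")      // LF only
assert !hasMixedLineEndings("one\r\ntwo\r\n")  // CRLF only
assert hasMixedLineEndings("one\r\ntwo\n")     // CRLF and LF mixed
```

Running a check like this over f.text for each file would show which files are affected before anything is rewritten.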

Here is the script:

if (!args) {
    println "Usage: groovy convert.groovy <path_to_directory>"
    println "Requires: extensions.txt in the current directory, a comma-separated list of file extensions to convert"
    return
}

Convert c = new Convert()
c.convert(args[0])


class Convert {
    // File extensions (without the dot) that should be converted.
    Set extensions = new HashSet()

    public void convert(String path) {
        loadExtensions()

        println "extensions to convert: ${extensions}"

        File f = new File(path)
        convertDir(f)
    }

    private void convertDir(File f) {
        def sum = 0
        def filesFound = 0
        def filesVisited = 0
        f.eachFileRecurse { File file ->
            filesVisited = filesVisited + 1
            // eachFileRecurse also passes directories; only regular files can be read.
            if (file.isFile() && shouldConvert(file)) {
                filesFound = filesFound + 1
                if (filesFound % 100 == 0) {
                    println "Files checked: ${filesFound}"
                }
                if (replaceLines(file)) {
                    sum = sum + 1
                    if (sum % 10 == 0) {
                        println "Replaced eol in files: ${sum}"
                    }
                }
            }
        }

        println "Files converted: ${sum}"
        println "Files checked: ${filesFound}"
        println "Files visited: ${filesVisited}"
    }

    private boolean shouldConvert(File f) {
        return extensions.contains(extension(f))
    }

    // Returns the text after the last dot in the file name, or null if there is no dot.
    private String extension(File f) {
        int idx = f.getName().lastIndexOf('.')
        if (idx != -1) {
            return f.getName().substring(idx + 1)
        } else {
            return null
        }
    }

    // Rewrites the file with LF line endings; returns true if the file was changed.
    // Note: f.text and f.write both use the platform default encoding.
    private boolean replaceLines(File f) {
        String text = f.text

        if (text.contains('\r')) {
            // Convert CRLF first, then any remaining lone CR.
            text = text.replaceAll('\r\n', '\n')
            text = text.replaceAll('\r', '\n')
            f.write(text)
            return true
        }
        return false
    }

    // Reads the comma-separated extensions from extensions.txt in the current working directory.
    private void loadExtensions() {
        File f = new File("extensions.txt")
        f.text.split(",").each {
            // trim() drops a trailing newline (CRLF or LF) and stray spaces around an entry.
            String ext = it.trim()
            if (ext) {
                extensions.add(ext)
            }
        }
    }
}

To use this script you have to install Groovy. Then, in the directory from which you run the script, create a file named extensions.txt that contains the list of file extensions to be checked. The extensions should be comma-separated, without the leading dot and without spaces in between, for example: groovy,java,gsp,xml

Then run the script with groovy convert.groovy <path_to_directory>.

What this simple script shows is how Groovy's extensions to java.io.File help when working with files and directories. If you go through the code you will see that the following methods are used:


  • f.eachFileRecurse - recursively visits every file and directory below the given directory

  • f.text - returns the content of the whole file as a String

  • f.write - writes a String to the file as its new content
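The three methods above can be tried in isolation with a small sketch like the following. It works in a throwaway temporary directory; the directory prefix and file names are made up for the example, and the single-regex replacement is a shorthand of mine rather than what the script above does.

```groovy
import java.nio.file.Files

// Create a temporary directory with one nested file (names are illustrative).
File dir = Files.createTempDirectory('eol-demo').toFile()
new File(dir, 'sub').mkdirs()
File sample = new File(dir, 'sub/sample.txt')

// File.write(String) replaces the whole content of the file.
sample.write("first\r\nsecond\rthird\n")

// File.text reads the whole file into a String.
assert sample.text.contains('\r')

// eachFileRecurse passes every file and directory below dir to the closure.
def visited = []
dir.eachFileRecurse { File f ->
    if (f.isFile()) {
        // \r\n? matches CRLF as well as a lone CR.
        f.write(f.text.replaceAll(/\r\n?/, '\n'))
        visited << f.name
    }
}

assert visited == ['sample.txt']
assert sample.text == "first\nsecond\nthird\n"

// Clean up the temporary directory.
dir.deleteDir()
```

Note that f.text and f.write use the platform default encoding unless a charset is passed, so files in another encoding may need f.getText('UTF-8') and f.write(text, 'UTF-8') instead.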

Feel free to use this script if you have the same problem, but note that it is provided as is, and I take no responsibility for any damage it may cause.

5 comments:

Anonymous said...

Linux/Unix is your friend...

find path to directory -name \*.extension | xargs [dos2unix|unix2dos]

Anonymous said...

Maybe Ant could have helped, too. It has the FixCRLF Task. Doc is here: http://ant.apache.org/manual/CoreTasks/fixcrlf.html

Anonymous said...

If working in Windows, you can use the Windows XP unix2dos command line utility:

C:\>unix2dos -D filename

jan said...

I didn't check, but if the unix2dos utility accepts only a single file name as a parameter, then this can be a problem if you have to change an unknown number of files.

Of course you can always write a small script that calls this command on each file in the given directory (or directories).

ZombieBlogger said...

Use Notepad++ and change the Format-->Convert to UNIX format.