As an R user, especially one who leans heavily on packages and frameworks to do the heavy lifting for me, I have developed a bit of an insecurity about my ability to write code. I contracted this insecurity from hanging out in online spaces where other people who write code hang out: not just data scientists but web devs, data engineers, and whatever it is exactly that people who write C++ do. In these spaces the only time R comes up is when it is being compared to Python, but mostly it is just absent from the conversation.
And I understand why this is the case. R is a very specialized language with a specific function: analyzing data. I love using it for that, but I was worried that I was spending my time learning a specialized language with non-transferable skills. I wanted to make sure I was building fundamental coding skills as well as expanding my analytical toolkit.
The source of this insecurity was confusion about the sorts of things I need to know. Code is a big world and I only work in a small corner of it. I did not have a clear picture of the difference between the kind of code it takes to do a good analysis and traditional software development, or of where those development skills fit into analytical work.
However, after a bit of digging I found that my lack of clarity around this topic was actually a function of one of the primary goals of the original S language (the precursor to R):
“The ambiguity [of the S language] is real and goes to a key objective: we wanted users to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs became clearer and their sophistication increased, they should be able to slide gradually into programming, when the language and system aspects would become more important.”
~ John Chambers, one of the creators of S
So the line between user and dev is not a hard line in the sand. It’s actually a spectrum. On one end there are the users of software and on the other there are developers.
I am really not sure where the line between developer and user is drawn, and I have no interest in drawing it. The last thing I want is to contribute to any gatekeeping. I think it is fair to say that when you are cleaning data with dplyr functions you are being more of a user of software, and when you are designing your analysis pipeline you are using more developer skills. For me at least, it has been beneficial to be aware of where I am on this ‘user-developer’ spectrum, as it helps me be clear on the skills and tools needed for an analysis.
Sometimes a project might be complex enough that it requires you to start leveraging the object-oriented programming capabilities of R directly, and when that time comes it helps to know that you need a different set of skills than the ones you have developed as a user.
Different projects will require you to work at different places on this spectrum, and you will slide up and down it during the course of an analysis.